Abstract

Complex real-world systems consist of collections of interacting processes/events.
These processes change over time in response to both internal and external stimuli as well
as to the passage of time. Many domains such as real-time systems diagnosis, (mechanized)
story understanding, planning and scheduling, and financial forecasting require the
capability to model complex systems under a unified framework to deal with both time and
uncertainty. Existing uncertainty representations and existing temporal models already
provide rich languages for capturing uncertainty and temporal information, respectively.
Unfortunately, these partial solutions have made it extremely difficult to unify time and
uncertainty in a way that cleanly and adequately models the problem domains at hand.
This difficulty is compounded by the practical necessity for effective and efficient
knowledge engineering under such a unified framework. Existing approaches for integrating
time and uncertainty exhibit serious compromises in their representations of either time,
uncertainty, or both. This thesis investigates a new model, the Probabilistic Temporal
Network, that represents temporal information while fully embracing probabilistic semantics.
The model allows representation of time-constrained causality, of when and if events occur,
and of the periodic and recurrent nature of processes.




On  Unifying  Time  and  Uncertainty: 

The  Probabilistic  Temporal  Network 

I.  Introduction 

The  field  of  Artificial  Intelligence  is  at  a  nexus  in  its  progress  in  the  modeling  of 
human  cognition  and  in  the  performance  of  useful  tasks¹.  The  critical  capability  for  passing
through  the  nexus  is  a  single  coherent  structure  unifying  both  time  and  uncertainty.  This 
thesis  investigation  advances  the  field  of  Artificial  Intelligence  by  providing  the  requisite 
unifying  structure. 

1.1  Overview 

In  the  evolution  of  expert  systems,  many  techniques  have  been  developed  to  represent 
human  knowledge.  One  of  the  earliest  (and  still  used)  techniques  is  to  represent  knowledge 
as  a  logical  system  of  if-then  style  rules  (rule-based  systems  [5,11]).  A  more  recent  approach 
is  to  represent  knowledge  (including  uncertainty)  of  a  situation,  or  “domain,”  as  a  network 
of  states  and  probabilities  (Bayesian  Networks  [22]). 

Many domains, whether they are rule-based, probabilistic, or other, require a
representation of time and of the temporal relationships between events. Most systems rely
on a mechanism in which a date is associated with each piece of knowledge. Relationships
are then determined simply by the date ordering. In more complicated domains, such
as emergency room diagnosis, the date mechanism is not sufficient; one must be able to
represent situations with relative knowledge like “precedes” or “during.”

Real-world domains requiring a unified model of time and uncertainty include
real-time system diagnosis, (mechanized) story understanding, planning and scheduling,
logistics, resource management, and financial forecasting. For example, consider
the following scenario found in computer security analysis:

1 At the recent Twelfth Conference on Uncertainty in Artificial Intelligence [14] (August 1996) held in
Portland, Oregon, a panel of experts on “UAI by 2005: Reflections on critical problems, directions, and
likely achievements for the next decade” identified the need for unifying time and uncertainty as among
the top priorities necessary for advancing the entire field of AI.
The computer operations center has a secure vault with a time-coded lock. This
time-lock allows the vault to be opened from 0900 hours to 0905 hours and from 2100 to
2105. The center has critical operations from 0855 to 1805. Access to the vault is
needed during the day and during critical operations, making the vault likely to be open
at those times. However, if the vault is closed, it cannot be reopened until the time-lock
allows.

This  scenario  provides  a  detailed  description  of  the  causal  and  temporal  relationships 
necessary  to  properly  model  the  secure  vault.  As  part  of  the  computer  security  analysis, 
one  must  be  able  to  translate  this  description  and  capture  the  knowledge  in  a  form  that 
can  be  correctly  processed  and  reasoned  with. 

Once  the  knowledge  representation  is  captured,  inferences  can  be  made.  Inferences 
can  be  of  several  types  including  prediction  and  explanation.  Prediction  is  concerned  with 
extending  forward  from  the  known  past  and  present  to  the  unknown  future  (statistical 
syllogism  [15]).  Explanation  involves  the  determination  of  causality  by  extending  from 
known  data  back  to  hypotheses  (abduction)  [15]. 

1.2  The  Problem 

Complex  systems  consist  of  collections  of  interacting  processes.  These  processes 
change  over  time  in  response  to  both  internal  and  external  stimuli  as  well  as  to  the  passage 
of  time  itself.  There  is  great  variety  in  the  behavior  of  processes.  Some  processes  are  simple 
events  such  as  opening  a  door  or  flipping  a  switch.  Others  are  complex.  For  example, 
consider  a  communication  channel  where  errors  may  occur  due  to  lightning  strikes  and 
faults  are  more  likely  to  occur  given  previous  errors.  Processes  can  also  be  recurrent  or 
periodic,  such  as  the  passing  of  day  into  night  or  shifts  in  a  work  schedule. 

What  is  needed  is  a  model  capable  of  representing  complex  systems  changing  over 
time.  Given  evidence  about  the  past  and  present  state  of  a  system,  one  must  be  able  to 
predict  the  system’s  future  state.  Also,  given  a  future  state,  one  must  be  able  to  determine 
the  most  probable  causes  for  that  state.  As  knowledge  about  such  systems  is  bound  to  be 
incomplete  and  as  the  systems  themselves  may  not  be  deterministic,  the  model  must  be 
able  to  represent  uncertainty.  This  uncertainty  permeates  all  areas:  the  duration  of  events, 
the  strength  of  causal  influence,  the  precise  temporal  relationship  between  events,  and  so
forth. 

1.3  Prior  Work 

Bayesian networks [22] provide a robust, probabilistic method of reasoning with
uncertainty. Bayesian networks, however, do not provide a direct mechanism for representing
temporal dependencies. For example, it is difficult to represent a situation such as the
variability of the time an employee arrives at work and the causal relationships between
the time of arrival and later events.

Prior temporal modeling techniques have made trade-offs in expressiveness between
semantics for time and semantics for uncertainty. Significant research has been done
exploring time nets (also called time-slice Bayesian networks) [12,16,17]. These approaches
build on the strong probabilistic semantics of Bayesian networks for expressing
uncertainty. The discrete time net approach developed by Kanazawa models time as a series of
points [16]. Events are considered to occur at an instant of time while facts are considered
to occur over a series of time points. Both events and facts are represented by random
variables. If dependencies only connect random variables at the same or consecutive
time points, then the net is said to be a Markov time net. In other words, the Markov
property holds for a model when the future is conditionally independent of the past, given
the present [17].

The work of Hanks et al. [12] is especially relevant to this thesis because of its emphasis
on both endogenous and exogenous change. Endogenous change is triggered by internal action,
such as the progression of disease, and exogenous change is triggered by external action,
such as the administration of drugs. In the temporal model presented in this thesis,
individual processes within a system must be able to respond to both endogenous (internal)
and exogenous (external) stimuli.

The time-sliced approaches mentioned above are based on point models of time and,
as such, require that events occur instantaneously. Often it is more natural to consider
events as taking place over intervals of time. Also, the relationships between events that
occur over intervals can be quite difficult to represent with only the three point relations
(precedes, follows, equals).

Santos’ Temporal Abduction Problem (TAP) [26] uses an interval representation of
time. In the TAP, each event has an associated interval during which the event occurs.
Relationships between events are expressed as directed edges from cause to effect within
a weighted AND/OR directed acyclic graph structure. Edges are qualified with the possible
interval relations. This allows great flexibility in expressing the relationship between events.
For example, if event A must occur either before or after event B then the relation is written
{<, >}. The TAP is an extension of cost-based abduction [7], using a numeric cost to
indicate the uncertainty of an event’s occurrence. These costs are generally determined in
an ad hoc manner by the domain expert. The TAP trades strong semantics of uncertainty
for a powerful and flexible temporal representation.

1.4  Thesis  Contribution

This thesis investigation presents a new model, the Probabilistic Temporal Network
(PTN), for representing temporal and atemporal information while remaining fully
probabilistic. The model allows representation of time-constrained causality, of when and if
events occur, and of the periodic and recurrent nature of processes. Bayesian networks
lie at the foundation of the system and provide the probabilistic basis. Allen’s interval
system [2] and his thirteen relations provide the temporal basis.

PTNs  focus  on  directly  modeling  processes  and  the  interaction  between  them.  The 
state  of  a  process  is  represented  by  a  value  at  a  given  time  interval.  A  process  can  be 
defined  over  any  number  of  such  intervals.  Random  variables  from  traditional  probability 
theory  are  used  to  model  a  process’  value  over  each  time  interval. 

The next chapter (Chapter II) discusses temporal reasoning and Bayesian networks.
From this foundation, the theoretical structure and probabilistic nature of the PTN are
developed and proven in Chapter III. A linear constraint system for performing belief
revision, along with a polynomially solvable subclass, is developed in Chapter IV. Chapter V
develops the process of extending an existing knowledge base into the temporal domain and
gives recommendations for knowledge engineering with the probabilistic temporal network. The
investigation concludes, in Chapter VI, with recommendations for further research. Along
the way, several examples are developed, including the secure vault scenario introduced
previously.
II.  Background

To  represent  complex,  dynamic  systems,  the  probabilistic  temporal  network  draws  from 
both  temporal  and  Bayesian  reasoning.  This  chapter  introduces  the  foundations  from 
which  the  PTN  is  built.  Section  2.1  briefly  develops  temporal  reasoning,  emphasizing  the 
aspects  relevant  to  this  thesis.  Section  2.2  introduces  Bayesian  networks  from  which  the 
PTN  draws  probabilistic  semantics  for  representing  uncertainty. 

2.1  Temporal  Reasoning 

Temporal  reasoning  has  been  defined  as  the  ability  to  reason  about  the  relationships 
in  time  between  events  [11].  It  is  necessary  to  reason  about  time  in  many  domains  including 
planning,  simulation,  natural  language  understanding,  and  diagnosis.  Temporal  reasoning
has  been  considered  in  philosophy  and  logic  since  Thales  and  Zeno  [19];  however,  it  is  only 
in  the  last  two  decades  that  temporal  reasoning  has  been  explicitly  considered  in  artificial 
intelligence.  McDermott  and  Allen,  with  their  work  in  the  early  nineteen-eighties  [2-4,20], 
brought  temporal  reasoning  into  the  AI  mainstream.  Other  models  for  temporal  reasoning 
include  point  algebras  [32],  semi-intervals  [10],  temporal  constraint  networks  [9],  and  weak 
representations  of  interval  algebras  [18]. 

McDermott provides one of the earliest temporal representations [20]. In his
approach, time is divided into a series of states with each state having an associated date,
i.e., point in time. Facts are expressed as being true during particular states.

Allen introduced interval temporal reasoning to the AI community [2,4]. Allen’s
interval algebra is governed by 13 relations on the intervals. Each event has an associated
interval, denoted [a, b], where a is the starting time point and b is the termination point.
Temporal relationships between events are expressed as relations between their intervals.
The relations between intervals, denoted A, are {=, <, >, m, mi, d, di, s, si, f, fi, o, oi} [2]
(see Table 2.1). For example, event A = [a, b] preceding event B = [c, d] is denoted A < B,
indicating that a < b < c < d. These relations are mutually exclusive and exhaustive.
Note that, while there are thirteen relations between intervals, only three relations exist
between points: precedes, equals, and follows.
Table 2.1  The thirteen possible interval-interval relations.

Symbol   Inverse   Name
=        =         equals
<        >         precedes
m        mi        meets
d        di        during
s        si        starts
f        fi        finishes
o        oi        overlaps
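
The thirteen relations can be told apart purely by comparing interval endpoints. The following is a minimal sketch of such a classifier, assuming intervals are given as (start, end) pairs with start < end; the function names `allen_relation` and `satisfies` are illustrative and do not come from the thesis.

```python
from typing import Set, Tuple

Interval = Tuple[float, float]  # (start, end) with start < end


def allen_relation(x: Interval, y: Interval) -> str:
    """Return the symbol r from Table 2.1 such that 'x r y' holds."""
    a, b = x
    c, d = y
    if not (a < b and c < d):
        raise ValueError("intervals must have non-zero duration")
    if b < c:
        return "<"          # x precedes y
    if d < a:
        return ">"          # y precedes x
    if b == c:
        return "m"          # x meets y
    if d == a:
        return "mi"         # y meets x
    # From here on the interiors of x and y overlap.
    if a == c and b == d:
        return "="
    if a == c:
        return "s" if b < d else "si"
    if b == d:
        return "f" if a > c else "fi"
    if c < a and b < d:
        return "d"          # x during y
    if a < c and d < b:
        return "di"         # x contains y
    return "o" if a < c else "oi"


def satisfies(x: Interval, y: Interval, relations: Set[str]) -> bool:
    """Check a disjunctive constraint such as x {<, m} y."""
    return allen_relation(x, y) in relations


print(allen_relation((0, 900), (900, 1200)))        # m
print(satisfies((0, 900), (900, 1200), {"<", "m"}))  # True
```

Because the tests are exhaustive and each pair of valid intervals reaches exactly one `return`, the sketch also illustrates why the relations are mutually exclusive once durationless intervals are excluded.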

Of special importance is Allen’s use of disjunctive sets to express uncertainty in the
exact relationship between intervals. For example, “interval A precedes or meets interval
B” is written as A{<, m}B. Some commonly used disjunctions are disjoint, written
{<, >, m, mi}, and contains, written {di, si, fi} [2]. These relationships between events
can be represented in a graphical form where nodes represent events and the arcs are
labeled with a disjunction of relations. The goal is to determine whether there exists an
interval assignment to all the events that satisfies the disjunctive relations. If such a solution
exists, then the given knowledge base is consistent.

While there is debate, in both philosophy and artificial intelligence, as to which
representation, points or intervals, is most appropriate, the expressive power of the two
methods is generally considered equivalent [2,16], as intervals can be represented with
beginning and end points in a point-based approach. Allen points out, however, some
paradoxes that can occur when points are allowed as the fundamental unit of time [2].
The problems arise from the durationless nature of points. Durationless intervals are not
allowed, i.e., for any interval $[t_1, t_2]$, $t_2 > t_1$. If $t_2 = t_1$ is allowed then the thirteen
interval-interval relations are not mutually exclusive. For example, $[t_1, t_1]$ starts $[t_2, t_3]$ is
indistinguishable from $[t_1, t_2]$ meets $[t_2, t_3]$ when $t_1 = t_2$. Mathematically, point relations
should be expressed as $t_1 \, R \, [t_2, t_3]$ and, as such, there is a different set of point-interval
relations which would add unnecessary overhead if used in our model. Our model strictly
adheres to the philosophy that intervals are primitive and have non-zero duration.
Definition 1. A temporal interval is a closed interval [a, b] on the reals¹ such that a < b.
Axiom 1. The temporal interval is the primitive temporal individual.

Since all intervals must have non-zero duration, how can point intervals be expressed?
The standard approach is to use $[t_0, t_0 + \epsilon]$ where $\epsilon$ is arbitrarily close, but not equal, to
zero. Note that $\epsilon$ can either be added to the end or subtracted from the beginning or both.
This approach is adopted in the PTN. To facilitate specifying the relationships between
intervals, $\epsilon$ is deemed constant across an entire model. Thus $[t_0, t_0 + \epsilon]\{m\}[t_0, t_1]$ does not
hold while $[t_0, t_0 + \epsilon]\{m\}[t_0 + \epsilon, t_1]$ does.
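
The effect of a single, model-wide constant can be illustrated with exact rational arithmetic. This is only a sketch: the value chosen for `EPSILON` and the helper names `point_interval` and `meets` are assumptions made for the example, not part of the PTN definition.

```python
from fractions import Fraction

# Assumed model-wide constant standing in for the arbitrarily small duration.
EPSILON = Fraction(1, 1000)


def point_interval(t0):
    """Represent the 'point' t0 as the short interval [t0, t0 + EPSILON]."""
    t0 = Fraction(t0)
    return (t0, t0 + EPSILON)


def meets(x, y):
    """x meets y exactly when x ends where y begins."""
    return x[1] == y[0]


p = point_interval(900)
print(meets(p, (Fraction(900), Fraction(1200))))            # False
print(meets(p, (Fraction(900) + EPSILON, Fraction(1200))))  # True
```

Using one shared constant means that the end of a point interval can be named exactly when writing the neighbouring interval, which is what makes the second relation hold.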

Aside from the temporal domain, neither Allen’s nor McDermott’s method can
explicitly model uncertainty. Uncertainty arises from many sources including missing or
unavailable data as well as overgeneralization of rules [11]. For example, given the
rules “Birds fly” and “Ostriches are birds,” we conclude that “Ostriches fly.” To prevent
such a conclusion, additional rules must be added such as “Some birds fly” or “Ostriches
don’t fly” to cover each exception. These additional rules can add significant complexity
to a knowledge base.

2.2  Bayesian  Networks  (BNs) 

Approaches to dealing with uncertainty include fuzzy logic [34], cost-based
techniques [7], certainty factors [29,30], Dempster-Shafer theory [27], and probabilistic
methods [22]. These approaches can be used both extensionally and intensionally. Extensional
systems, such as rule-based systems, attach some sort of truth value to each rule or
formula. The truth values of formulae are calculated functionally from the truth values of
sub-formulae. Intensional systems, such as model-based systems, attach uncertainty to the
possible states of the system itself [22]. Extensional systems are generally computationally
efficient but their uncertainty measures are semantically weak. Intensional systems, on
the other hand, are generally computationally expensive and semantically strong [22]. By
carefully restricting which parts of an intensional system are relevant to the other parts,
the computational limitations can, to some degree, be overcome.

1 Temporal intervals can be defined over the rational numbers if countability is an issue, perhaps in
proving some property of the model.

In  probabilistic  reasoning,  random  variables  (RVs)  are  used  to  represent  events 
and/or  objects  in  the  world.  By  assigning  various  values  to  these  RVs,  we  can  model 
the  current  state  of  the  world  and  weight  the  states  according  to  the  joint  probabilities. 


Figure  2.1  “Suppose  when  I  go  home  at  night,  I  want  to  know  if  my  family  is  home 
before  I  try  the  doors.  Now  often  when  my  wife  leaves  the  house,  she  turns 
on  an  outdoor  light.  However,  she  sometimes  turns  on  this  light  if  she  is 
expecting  a  guest.  Also,  we  have  a  dog.  When  nobody  is  home,  the  dog  is 
put  in  the  back  yard.  The  same  is  true  if  the  dog  has  bowel  troubles.  Finally, 
if  the  dog  is  in  the  backyard,  I  will  probably  hear  her  barking.”  [6] 

Bayesian networks are probabilistic intensional systems in which independence
assumptions are used to restrict relevance. A Bayesian network is a directed acyclic graph
(DAG) of random variable (RV) relationships. Directed arcs between RVs represent
conditional dependencies. When all the parents of a given RV are instantiated, that RV is
conditionally independent of the remaining non-descendant RVs. For a more formal
description of the independence semantics in Bayesian networks, see d-separation and
I-maps in Charniak [6] and Pearl [22]. Figure 2.1 presents a simple example of a Bayesian
network which demonstrates the nomenclature used in the following paragraphs.
In general, we are searching for the world state with highest likelihood. This is called
belief revision [22]. Belief revision is best used for modeling explanatory/diagnostic tasks.
Basically, some evidence or observation is given to us, and our task is to come up with a
set of hypotheses that together constitute the most satisfactory explanation/interpretation
of the evidence at hand. Belief revision is a form of abductive reasoning [7,13,23]. More
formally, if W is the set of all RVs in our given Bayesian network and e is our given
evidence², any complete instantiation to all the RVs in W that is consistent with e is
called an explanation or interpretation of e. The problem, then, is to find an explanation
w* such that

$$P(w^* \mid e) = \max_{w} P(w \mid e). \qquad (2.1)$$

$w^*$ is known as the most-probable explanation. The joint probability of any explanation $w$,

$$w = (X_1 = x_1) \wedge (X_2 = x_2) \wedge \ldots \wedge (X_m = x_m) \qquad (2.2)$$

(where $X_1 \ldots X_i \ldots X_m$ is an arbitrary ordering of random variables in $W$, and $x_i$ is some
assignment to random variable $X_i$) is found using the chain rule [22]:

$$P(w) = P(x_m \mid x_{m-1}, \ldots, x_1) \cdot P(x_{m-1} \mid x_{m-2}, \ldots, x_1) \cdots P(x_2 \mid x_1) \cdot P(x_1) \qquad (2.3)$$

Bayesian  networks  take  the  chain  rule  one  step  further  by  making  the  important 
observation  that  certain  RV  pairs  may  become  uncorrelated  once  information  concerning 
other  RV(s)  is  known.  More  precisely,  we  may  have  the  following  independence  condition: 

$$P(A \mid X_1, \ldots, X_n, U) = P(A \mid X_1, \ldots, X_n) \qquad (2.4)$$

for some collection of RVs $U$. Intuitively, we can interpret this as saying that given
knowledge of $X_1, \ldots, X_n$, knowledge of $U$ is irrelevant to the state of $A$.

Combined  with  the  chain  rule,  these  conditional  independencies  allow  us  to  replace 
the  terms  in  the  chain  rule  with  smaller  conditionals.  Thus,  instead  of  explicitly  keeping 
the  joint  probabilities,  all  we  need  are  smaller  conditional  probability  tables,  from  which 
the  joint  probabilities  can  then  be  calculated. 
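
Stated compactly, and assuming the variables are ordered so that every RV appears after its parents, the paragraph above amounts to the following product form. The notation $\mathrm{pa}(x_i)$, denoting the values that $w$ assigns to the parents of $X_i$, is introduced here only for illustration.

```latex
P(w) \;=\; \prod_{i=1}^{m} P\bigl(x_i \mid \mathrm{pa}(x_i)\bigr)
```

Each factor is one row of the conditional probability table stored with $X_i$, so the number of stored values grows with the size of the largest parent set rather than with the total number of variables.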

2 That  is,  e  represents  a  set  of  instantiations  made  on  a  subset  of  W. 
For example, an application of the chain rule for computing the probability of an
explanation for the Bayesian network in Figure 2.1 is

$$P(hb, do, lo, fo, bp) = P(hb \mid do, lo, fo, bp) \cdot P(do \mid lo, fo, bp) \cdot P(lo \mid fo, bp) \cdot P(fo \mid bp) \cdot P(bp) \qquad (2.5)$$

Using the dependencies, we can simplify this to

$$P(hb, do, lo, fo, bp) = P(hb \mid do) \cdot P(do \mid fo, bp) \cdot P(lo \mid fo) \cdot P(fo) \cdot P(bp) \qquad (2.6)$$

Since these conditional probabilities needed for the simplified chain rule are exactly those
provided for each random variable in the Bayesian network, computation of joint
probabilities is straightforward.
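
The simplified factorization and the search for a most-probable explanation can be sketched in a few lines. The numeric table entries below are illustrative assumptions (the text does not list the tables for Figure 2.1), and exhaustive enumeration is used only because the network has five binary variables; it is not the belief revision method developed later in the thesis.

```python
from itertools import product

# Illustrative (assumed) conditional probability tables for the network of
# Figure 2.1; each entry is the probability that the variable is true.
P_FO = 0.15                                  # P(fo)
P_BP = 0.01                                  # P(bp)
P_LO = {True: 0.6, False: 0.05}              # P(lo | fo)
P_DO = {(True, True): 0.99, (True, False): 0.90,
        (False, True): 0.97, (False, False): 0.3}   # P(do | fo, bp)
P_HB = {True: 0.7, False: 0.01}              # P(hb | do)

NAMES = ("hb", "do", "lo", "fo", "bp")


def bern(p_true, value):
    """Probability of a binary value given the probability of 'true'."""
    return p_true if value else 1.0 - p_true


def joint(hb, do, lo, fo, bp):
    """Joint probability using the simplified factorization (2.6)."""
    return (bern(P_HB[do], hb) * bern(P_DO[(fo, bp)], do)
            * bern(P_LO[fo], lo) * bern(P_FO, fo) * bern(P_BP, bp))


def most_probable_explanation(evidence):
    """Enumerate complete instantiations consistent with the evidence.

    Maximizing P(w) over consistent w also maximizes P(w | e), because
    P(e) is the same constant for every candidate.
    """
    best, best_p = None, -1.0
    for values in product([True, False], repeat=len(NAMES)):
        w = dict(zip(NAMES, values))
        if any(w[k] != v for k, v in evidence.items()):
            continue
        p = joint(**w)
        if p > best_p:
            best, best_p = w, p
    return best, best_p


explanation, prob = most_probable_explanation({"hb": True, "lo": True})
print(explanation, prob)
```

With these assumed tables, hearing the dog bark while the light is on is best explained by the family being out and the dog being outside without bowel problems.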

Bayesian networks [22] are an intuitive method for representing uncertainty. Bayesian
networks, however, do not provide a direct mechanism for representing temporal
dependencies. For example, it is difficult to represent a situation such as the variability of the
time of an employee’s arrival at work and the causal relationships between the time of
arrival and later events.
III.  Theoretical  Structure


This  chapter,  in  Section  3.1,  defines  the  formal  structure  of  the  probabilistic  temporal 
network.  The  PTN’s  ability  to  model  periodic  and  recurrent  processes  is  presented  in 
Section  3.2.  Section  3.2  also  proves  the  probabilistic  nature  of  the  PTN.  The  chapter 
concludes  with  a  discussion  of  a  closely  related  temporal  model  which  has  appeared  very 
recently  in  the  literature. 

3.1  Combining  Time  and  Probability 

As previously discussed in Chapter I, the time-sliced approaches provide strong
probabilistic semantics for representing uncertainty; however, they are constrained in their
temporal expressiveness. The temporal abduction problem, on the other hand, has strong
interval-based temporal semantics, but lacks strong probabilistic semantics.

What is needed, then, is a combined approach integrating strong probabilistic and
temporal semantics. While much research has been done on point-based probabilistic
temporal network models, little or no research has been identified using interval
methods, specifically Allen’s interval relations, for intensional probabilistic reasoning [22]. As
mentioned earlier, the interval representation of time is important for the expressive set of
relations available. The closest research is the temporal abduction problem discussed above
which does not have strict probabilistic semantics. Recent work by Young and Santos [33]¹
does present a starting point, defining the network structure for a new model.

The  nodes  of  the  probabilistic  temporal  network  are  temporal  aggregates  and  the 
edges  are  the  causal/temporal  relationships  between  aggregates.  Each  aggregate  represents 
a  process  changing  over  time.  A  temporal  aggregate  contains  every  interval  of  interest  for 
the  process.  Each  interval  has  an  associated  random  variable  giving  the  state  of  the 
process  over  that  interval.  Figure  3.1  depicts  an  example  temporal  aggregate  modeling 
when  a  “vault”  is  open.  The  ‘Vault-Open’  TA  is  dependent  on  itself  ( VO )  and  two  other 
processes  ( TU  and  CO).  This  example  is  expanded  into  a  full  network  next. 

1 In which Probabilistic Temporal Networks (PTNs) are termed Temporal Bayesian Networks (TBNs)
and Temporal Aggregates (TAs) are termed Temporal Random Variables (TRVs).
Vault-Open = {([0000,0900], o1), ([0900,1200], o2),
              ([1200,2100], o3), ([2100,2400], o4)}

P(ox|VO,CO,TU) = 0.95        P(ox|¬VO,CO,TU) = 0.98
P(ox|VO,CO,¬TU) = 0.80       P(ox|¬VO,CO,¬TU) = 0.0
P(ox|VO,¬CO,TU) = 0.4        P(ox|¬VO,¬CO,TU) = 0.6
P(ox|VO,¬CO,¬TU) = 0.4       P(ox|¬VO,¬CO,¬TU) = 0.0

Figure 3.1 A simple temporal aggregate, ‘Vault-Open,’ defined over four intervals. The
conditional probability tables show ‘Vault-Open’ to be dependent on itself
through some temporal causal relationship.


As  is  the  case  in  the  real  world,  the  apparent  state  of  a  process  is  dependent  on  the 
temporal  perspective  of  observation.  An  observation  made  in  the  middle  of  the  night  as  to 
whether  or  not  someone  is  at  work  may  return  different  results  than  if  the  observation  is 
made  during  the  day.  A  switch  can  be  turned  on  only  if,  at  some  previous  time,  the  switch 
was  turned  off;  the  light  can  be  on  only  when  the  switch  is  on. 

To  model  the  effects  of  different  perspectives  on  the  apparent  state  of  a  process, 
edges  in  the  network  consist  of  a  disjunctive  set  of  interval  relations  and  a  schema  to  map 
the  random  variables  of  the  intervals  to  a  single  value.  This  allows  the  precise  selection  of 
those  intervals  during  which  the  state  of  one  process  affects  another. 
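
One possible reading of such an edge can be sketched as follows, using the intervals of the secure vault example. Several details are assumptions made for illustration: relations are read as "parent interval R child interval", an OR schema is taken to be a logical OR over the selected random variables, an empty selection is taken to yield false, and only the three relations needed for the example are implemented.

```python
from typing import Callable, Dict, List, Set, Tuple

Interval = Tuple[int, int]
Aggregate = List[Tuple[Interval, str]]  # (interval, random-variable name) pairs

# Endpoint tests for the relations used in the vault example, read as
# "parent interval R child interval" (an assumption of this sketch).
RELATIONS: Dict[str, Callable[[Interval, Interval], bool]] = {
    "s": lambda p, c: p[0] == c[0] and p[1] < c[1],   # parent starts child
    "m": lambda p, c: p[1] == c[0],                   # parent meets child
    "di": lambda p, c: p[0] < c[0] and c[1] < p[1],   # parent contains child
}


def select_rvs(parent: Aggregate, child_interval: Interval,
               relset: Set[str]) -> List[str]:
    """Names of parent RVs whose interval stands in one of the relations."""
    return [rv for interval, rv in parent
            if any(RELATIONS[r](interval, child_interval) for r in relset)]


def or_schema(values: List[bool]) -> bool:
    """Map the selected RV values to one value; an empty list gives False."""
    return any(values)


VAULT_OPEN = [((0, 900), "o1"), ((900, 1200), "o2"),
              ((1200, 2100), "o3"), ((2100, 2400), "o4")]
TIME_UNLOCK = [((900, 905), "l1"), ((2100, 2105), "l2")]
CRITICAL_OPS = [((855, 1805), "c1")]

child = (900, 1200)  # the interval of o2
print(select_rvs(TIME_UNLOCK, child, {"s"}))    # ['l1']
print(select_rvs(CRITICAL_OPS, child, {"di"}))  # ['c1']
print(select_rvs(VAULT_OPEN, child, {"m"}))     # ['o1']
```

For the interval [1200, 2100], no Time-UnLock interval is selected, so under these assumptions the schema value for that parent is false.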

Vault-Open = {([0000,0900], o1), ([0900,1200], o2),
              ([1200,2100], o3), ([2100,2400], o4)}

P(ox|VO,CO,TU) = 0.95        P(ox|¬VO,CO,TU) = 0.98
P(ox|VO,CO,¬TU) = 0.80       P(ox|¬VO,CO,¬TU) = 0.0
P(ox|VO,¬CO,TU) = 0.4        P(ox|¬VO,¬CO,TU) = 0.6
P(ox|VO,¬CO,¬TU) = 0.4       P(ox|¬VO,¬CO,¬TU) = 0.0

Time-UnLock = {([0900,0905], l1), ([2100,2105], l2)}
P(lx) = 1.0

Critical-Operations = {([0855,1805], c1)}
P(cx) = 1.0

Edges (interval relations, schema):
Time-UnLock -> Vault-Open:          ({s}, OR)
Vault-Open -> Vault-Open:           ({m}, OR)
Critical-Operations -> Vault-Open:  ({di}, OR)

Figure 3.2 A probabilistic temporal network modeling a secure vault. This example
extends the ‘Vault-Open’ temporal aggregate in Figure 3.1. Note that ‘ox,’
‘lx,’ and ‘cx’ above are instantiated with o1 ... o4, l1 ... l2, and c1 respectively.




Figure  3.2  shows  a  probabilistic  temporal  network  modeling  our  secure  vault  scenario 
detailing  the  various  components  and  their  interactions.  These  network  components  are 
discussed  and  defined  in  the  following  paragraphs. 

3.1.1 Temporal Aggregates. A process, such as ‘Vault-Open’ in Figure 3.2, is
represented in the PTN by a temporal aggregate. Intuitively, a temporal aggregate consists
of the set of states, e.g., {true, false}, {1, 2, 3}, or {false} ∪ {Red, Blue}, that the process
can take on, and a set of temporal intervals each having an associated random variable.
Each such RV has a conditional probability table defined over the states of the process.
Definition 2. A temporal aggregate (TA) is an ordered pair (T, E) in which E is a set
of states and T (pronounced Tau) is a set of ordered pairs (i, r) where i is a temporal
interval and r is a random variable defined over E. For all pairs $(i_1, r_1)$ and $(i_2, r_2)$ in T,
$r_1 = r_2$ iff $i_1 = i_2$. The dependencies for each random variable in the TA are defined only
by temporal causal relationships between TAs.

In the author’s prior work [33], temporal aggregates (there termed temporal random
variables) were allowed to have internal dependencies to model endogenous change. This
was found to be a source of temporal inconsistency and is better represented through
self-loops, as demonstrated in Figure 3.2. Endogenous change is explicitly modeled in the PTN
with cyclic temporal causal relationships. Endogenous change can be seen in the
‘Vault-Open’ process in Figure 3.2, in which the vault is more likely to stay open, given that it is
open. Also note that this definition allows T to contain a potentially infinite number of
interval-RV pairs. It is assumed that temporal aggregates are finite, both in T and in E.

‘Vault-Open’ is formally written, according to Definition 2, as $VO = (T_{VO}, E_{VO})$ where

\[ T_{VO} = \{([0000, 0900], o_1),\ ([0900, 1200], o_2),\ ([1200, 2100], o_3),\ ([2100, 2400], o_4)\} \tag{3.1} \]

and

\[ E_{VO} = \{\text{true}, \text{false}\} \tag{3.2} \]

with the conditional probability table being

\[
\begin{array}{ll}
P(o_x \mid VO, CO, TU) = 0.95 & P(o_x \mid \neg VO, CO, TU) = 0.80 \\
P(o_x \mid VO, CO, \neg TU) = 0.80 & P(o_x \mid \neg VO, CO, \neg TU) = 0.0 \\
P(o_x \mid VO, \neg CO, TU) = 0.4 & P(o_x \mid \neg VO, \neg CO, TU) = 0.6 \\
P(o_x \mid VO, \neg CO, \neg TU) = 0.4 & P(o_x \mid \neg VO, \neg CO, \neg TU) = 0.0
\end{array}
\]

for all RVs $o_x$ where $o_x \in \{o_1, o_2, o_3, o_4\}$. The $\neg$ symbol (as in $\neg TU$ above) indicates that the RV is assigned false; a non-negated RV (as in $TU$) indicates that the RV is assigned true.


Since $E = \{\text{true}, \text{false}\}$, $P(\neg o_x \mid VO, CO, TU) = 1 - P(o_x \mid VO, CO, TU)$. This condition holds for the other probabilities as well. In general, the probabilities are not explicitly shown when the probability of the true case is zero, e.g., $P(o_x \mid \neg VO, \neg CO, \neg TU) = 0.0$ would not be shown. Symbols used for temporal aggregates are uppercase letters from the end of the alphabet, e.g., X or Y, or uppercase abbreviations from the text name of the process being modeled, e.g., process ‘Vault-Open’ has a temporal aggregate denoted VO. Random variables within temporal aggregates are denoted with lowercase letters, e.g., a, b, and c or $y_1$ and $y_2$. Since the possible states of the aggregate are often evident from the conditional probability tables, E is often not explicitly shown. To differentiate between components of different temporal aggregates, the symbol of the component can contain the subscripted symbol of the associated TA, e.g., $E_{VO}$ or $o_{1_{VO}}$.

An assignment to a temporal aggregate consists of an assignment to each interval-RV pair.

Definition 3. A is an aggregate assignment (AA) iff A is a set of ordered pairs $(\tau, \sigma)$ where $\tau \in T$ and $\sigma \in E$ such that $\forall \tau \in T$, there exists a unique $\sigma \in E$ such that $(\tau, \sigma) \in A$. In other words, an aggregate assignment is a function from T into E.


For example,

\[ A_{VO} = \{([0000, 0900], \text{false}),\ ([0900, 1200], \text{true}),\ ([1200, 2100], \text{true}),\ ([2100, 2400], \text{false})\} \tag{3.4} \]

is an AA for the temporal aggregate VO from Figure 3.1. $A_{VO}$ might be read “The vault was closed from 0000 hours to 0900 hours, open from 0900 hours to 2100 hours, and closed from 2100 hours to 2400 hours.” The use of past tense here is arbitrary; is closed or will be closed would be equally appropriate. Aggregate assignments are denoted by uppercase letters from the beginning of the alphabet, e.g., A or B, subscripted if necessary by the symbol for the associated temporal aggregate.

Sometimes  the  entire  state  of  a  TA  is  not  known.  For  example,  we  may  only  know 
that  the  vault  was  closed  from  0000  to  0900.  A  partial  aggregate  assignment,  which  is 
simply  a  subset  of  an  aggregate  assignment,  expresses  this. 

Definition 4. P is a partial aggregate assignment (PAA) for some temporal aggregate, X, iff there exists an A such that $P \subseteq A$ where A is an aggregate assignment for X. In other words, a partial aggregate assignment is a partial function from T into E.

Our example, where the vault is only known to be closed over one interval, is thus written:

\[ P_{VO} = \{([0000, 0900], \text{false})\} \tag{3.5} \]


Note that $P_{VO}$ is a subset of aggregate assignment $A_{VO}$ above. PAAs are usually denoted
by  capital  letters  from  the  middle  of  the  alphabet;  however,  since,  by  definition  an  aggregate 
assignment  is  also  a  PAA,  some  uppercase  letters  from  the  beginning  of  the  alphabet  may 
sometimes  be  used  for  PAAs. 
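Definitions 2 through 4 can be made concrete with a small sketch. The class and function names below, and the encoding of an interval as a pair of HHMM integers, are assumptions chosen for illustration and are not part of the model itself.

```python
from dataclasses import dataclass

# An interval is a (start, end) pair of HHMM integers; this encoding is an assumption.
Interval = tuple[int, int]


@dataclass(frozen=True)
class TemporalAggregate:
    """A temporal aggregate (T, E): interval-RV pairs plus a set of states."""
    pairs: dict        # maps each interval to the name of its random variable
    states: frozenset  # the state set E


def is_partial_assignment(ta, assignment):
    """True iff `assignment` maps some intervals of `ta` to states in E."""
    return all(iv in ta.pairs and s in ta.states for iv, s in assignment.items())


def is_aggregate_assignment(ta, assignment):
    """True iff `assignment` is a total function from the intervals of T into E."""
    return is_partial_assignment(ta, assignment) and set(assignment) == set(ta.pairs)


# The 'Vault-Open' aggregate from Equations 3.1 and 3.2.
VO = TemporalAggregate(
    pairs={(0, 900): "o1", (900, 1200): "o2", (1200, 2100): "o3", (2100, 2400): "o4"},
    states=frozenset({True, False}),
)
# The aggregate assignment of Equation 3.4 and the partial one of Equation 3.5.
A_VO = {(0, 900): False, (900, 1200): True, (1200, 2100): True, (2100, 2400): False}
P_VO = {(0, 900): False}
```

Storing an assignment as a dictionary keyed by interval makes the uniqueness requirement of Definition 3 hold by construction, so only totality needs to be checked.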


Line-Open = {([0900, 0910], lo1), ([0905, 0915], lo2), ([0910, 0920], lo3)}

P(lo1 | ¬LO) = 1/3
P(lo2 | ¬LO) = 1/2
P(lo3 | ¬LO) = 1

Figure 3.3 A simple, one-process probabilistic temporal network enforcing a mutual exclusion relationship. A communication line can only be opened given that it has not previously been opened.


3.1.2  Temporal  Causal  Relationships.  How  are  the  aggregates  interconnected? 
The  example  network  in  Figure  3.3  shows  a  directed  edge  from  ‘Line-Open’  to  itself  labeled 
({m, o}, OR). The edge, combined with the conditional probability tables, enforces a mutual exclusion constraint on ‘Line-Open,’ i.e., the communication line can only be opened over one of the three possible intervals. Mutual exclusion in time is an important characteristic
of  many  processes  which  can  occur  over  only  one  of  several  different  intervals.  Edges  in 
the  probabilistic  temporal  network  are  temporal  causal  relationships  or  TCRs. 


Figure  3.4  The  probabilistic  temporal  network  from  Figure  3.3  decomposed  to  explicitly 
show  the  intervals  (small  circles)  and  the  temporal  relationships  between 
intervals  (dotted  lines). 


Figure  3.5  The  network  in  Figure  3.4  with  the  temporal  causal  relationship  replaced 
with  the  TCR  induced  random  variables.  The  induced  random  variables  are 
labeled  with  the  name  of  the  corresponding  RV-schema,  in  this  case,  OR. 

While  portrayed  graphically  as  a  labeled  edge  between  temporal  aggregates,  the 
TCR  is  actually  shorthand  for  a  set  of  induced  random  variables  that  enforce  the  temporal 
constraints.  These  random  variables  combine  the  intervals  selected  by  a  disjunctive  set  of 
interval  relations  (see  Table  2.1),  e.g.,  {m,  o},  using  the  probability  distribution  specified 
by  a  schema,  e.g.,  OR,  XOR,  PASSTHROUGH.  Figure  3.4  shows  the  example  network 
from Figure 3.3 with the intervals and temporal relations explicitly shown. For example, the dotted line from interval $lo_1$ to interval $lo_2$ shows that $lo_1$ overlaps $lo_2$.

Figure  3.5  shows  the  network  with  the  TCR  replaced  by  the  appropriate  induced  RVs. 
The  figure  shows  that  the  probability  of  the  line  being  open  over  [0910,  0920]  is  dependent 
on  the  probability  of  the  line  being  open  over  [0900,0910]  and  [0905,0915],  mediated  by 
the  induced  OR  random  variable.  Likewise,  the  probability  of  the  line  being  open  over 
interval  [0905,0915]  is  dependent  on  the  line  being  open  over  [0900,0910]  again  mediated 
by  OR.  The  probability  of  the  line  being  open  over  [0900, 0910]  is  independent  of  the  other 
probabilities. 
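The expansion just described can be reproduced mechanically. The sketch below is a minimal illustration, assuming intervals are encoded as (start, end) pairs of HHMM integers; the function names `meets`, `overlaps`, and `expand_tcr` are hypothetical helpers, not names from the thesis.

```python
def meets(a, b):
    """Interval relation 'meets': a ends exactly where b starts."""
    return a[1] == b[0]


def overlaps(a, b):
    """Interval relation 'overlaps': a starts first, b starts inside a, a ends inside b."""
    return a[0] < b[0] < a[1] < b[1]


def expand_tcr(cause, effect, relations):
    """For each effect RV, list the cause RVs that feed its induced random variable."""
    parents = {}
    for iy, ry in effect.items():
        parents[ry] = sorted(
            rx for ix, rx in cause.items()
            if any(rel(ix, iy) for rel in relations)
        )
    return parents


# 'Line-Open' from Figure 3.3, with the self-loop TCR LO({m, o}, OR)LO.
LINE_OPEN = {(900, 910): "lo1", (905, 915): "lo2", (910, 920): "lo3"}
induced = expand_tcr(LINE_OPEN, LINE_OPEN, [meets, overlaps])
```

Under these assumptions `induced` maps `lo1` to no parents, `lo2` to `lo1`, and `lo3` to both `lo1` and `lo2`, which agrees with the dependencies read off Figure 3.5.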

What are the semantics behind the temporal causal relationship? The probability of some TA Y taking on some particular state over each interval is dependent on TA X taking on some state over interval(s) fitting the temporal relation, e.g., “no interval in Y can have state true unless that interval follows some interval in X having state true.” This is written X({<}, OR)Y with every $(i, r) \in T_Y$ having conditional probabilities of the form $P(r \mid \ldots, \neg X) = 0.0$. Schemas in general and the OR schema in particular are further discussed below.

Definition 5. A temporal causal relationship (TCR) describes a relationship between two temporal aggregates $X = (T_X, E_X)$ and $Y = (T_Y, E_Y)$ where X is considered the “cause” and Y the “effect.” Textually, the TCR is written $X(\mathcal{R}, \mathcal{M})Y$ where $\mathcal{R}$ is a nonempty set of interval relations and $\mathcal{M}$ is a schema for describing random variables. Graphically, the TCR is presented as a directed edge from the node for X to the node for Y, labeled with $(\mathcal{R}, \mathcal{M})$. Formally, the relationship is written as the four-tuple $(\mathcal{R}, \mathcal{M}, X, Y)$.

The TCR induces, for each interval-RV pair, $(i_Y, r_Y)$ in $T_Y$, a random variable $\mathcal{M}_r$, defined over $E_X$, such that

1. $r_Y$ is directly dependent on $\mathcal{M}_r$.

2. for each $(i_X, r_X) \in T_X$ where $i_X \mathcal{R} i_Y$, $\mathcal{M}_r$ is directly dependent on $r_X$.

3. for each random variable x such that $\mathcal{M}_r$ is directly dependent on x, there exists an $i_X$ such that $(i_X, x) \in T_X$.

4. the conditional probability table for $\mathcal{M}_r$ is defined by the schema $\mathcal{M}$.
Temporal causal relationships are rarely given explicit names. Notationally, the
random  variables  in  the  interval-RV  pairs  in  the  effect  TA  are  usually  written,  in  the 
conditional  probability  tables,  as  being  dependent  simply  on  the  cause  TA.  This  can  be 
seen  in  the  tables  for  the  ‘Vault-Open’  temporal  aggregate  in  Figure  3.2.  In  cases  where 
there  is  more  than  one  TCR  between  two  TAs,  some  appropriate  name  or  symbol  can  be 
associated  with  the  TCR  and  the  dependencies  in  the  effect  TA  can  be  written  as  the  name 
of  the  cause  TA  subscripted  with  the  name  of  the  TCR. 

The  random  variable  schema  algorithmically  defines  the  conditional  probability  tables 
for  the  random  variables  induced  by  the  temporal  causal  relationship. 

Definition 6. A random variable schema $\mathcal{M}$ takes as parameters a set of states E, a set of interval-RV pairs T with RVs defined over E, a single interval-RV pair (i, r), and an algorithm A which together define the conditional probability table for a random variable $\mathcal{M}_r$ with states E such that for each $(i_\tau, r_\tau) \in T$, $\mathcal{M}_r$ is directly dependent on $r_\tau$. $\mathcal{M}_r$ is directly dependent on nothing else. The conditional probability table for $\mathcal{M}_r$ is constructed with an algorithm, A. A can be either declarative or procedural.


For many models, these schemas are extremely simple, e.g.,

\[ \mathrm{OR} : \left( T,\ E = \{\text{true}, \text{false}\},\ A_{\mathrm{OR}} \right) \rightarrow \mathrm{OR}_r \tag{3.6} \]

where $A_{\mathrm{OR}}$ is defined as


Algorithm 1: ($A_{\mathrm{OR}}$)

1. Let $(i_{\tau_1}, r_{\tau_1}) \ldots (i_{\tau_n}, r_{\tau_n})$ be an arbitrary ordering of the elements of T

2. Create random variable $\mathrm{OR}_r$ such that for each assignment A to $\{r_{\tau_1}, \ldots, r_{\tau_n}\}$

(a) If there exists an $r \in A$ such that r = true, let

$P(\mathrm{OR}_r = \text{true} \mid A) = 1$
$P(\mathrm{OR}_r = \text{false} \mid A) = 0$

(b) else let

$P(\mathrm{OR}_r = \text{true} \mid A) = 0$
$P(\mathrm{OR}_r = \text{false} \mid A) = 1$
Exclusive-or, XOR, can be defined by changing “there exists an $r \in A$” in step 2a above to “there exists a unique $r \in A$.” The other logical operations are also easily defined.
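The OR and XOR schemas are deterministic, so their conditional probability tables can be generated by enumerating parent assignments. The following is a minimal sketch under that reading; the function names and the dictionary representation of a table are assumptions made for illustration.

```python
from itertools import product


def schema_cpt(parent_names, fires):
    """Build a deterministic table: P(induced RV = true | parent assignment)."""
    cpt = {}
    for values in product([True, False], repeat=len(parent_names)):
        cpt[values] = 1.0 if fires(values) else 0.0
    return cpt


def or_cpt(parent_names):
    # Step 2a of the OR algorithm: true iff at least one parent is true.
    return schema_cpt(parent_names, lambda vs: any(vs))


def xor_cpt(parent_names):
    # The XOR variant: true iff exactly one parent is true.
    return schema_cpt(parent_names, lambda vs: sum(vs) == 1)
```

Each table stores only the probability of the true case; the false case is its complement. With no parents, `or_cpt([])` contains the single entry `(): 0.0`, which corresponds to an induced OR variable that is false with probability 1.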


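The schema PASSTHROUGH, defined:

\[ \mathrm{PASSTHROUGH} : \left( T = \{(i_\tau, r_\tau)\},\ (i, r),\ A_{\mathrm{PASSTHROUGH}} \right) \rightarrow \mathrm{PASSTHROUGH}_r \tag{3.7} \]

with $A_{\mathrm{PASSTHROUGH}}$ defined as

Algorithm 2: ($A_{\mathrm{PASSTHROUGH}}$)

1. Create random variable $\mathrm{PASSTHROUGH}_r$ such that for each $\sigma \in E$

$P(\mathrm{PASSTHROUGH}_r = \sigma \mid r_\tau = \sigma) = 1$
$P(\mathrm{PASSTHROUGH}_r \neq \sigma \mid r_\tau = \sigma) = 0$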

produces  a  random  variable  for  a  causal  relationship  from  a  singleton  TA  (only  one  interval- 
RV  pair  in  T).  The  temporal  causal  relationship 


\[ X(A, \mathrm{PASSTHROUGH})Y, \tag{3.8} \]

read “X exerts direct causal influence on Y under all temporal relationships” is analogous
to  the  causal  relation  in  Bayesian  networks.  This  type  of  relationship  is  useful  when 
‘temporalizing’  existing  Bayesian  networks. 


3.1.3  Probabilistic  Temporal  Networks.  A  probabilistic  temporal  network  is  a 
directed  graph  in  which  the  nodes  are  TAs  and  the  edges  are  temporal  causal  relationships. 
Definition 7. A probabilistic temporal network (PTN) is an ordered pair (R, E) where R is a set of temporal aggregates and E is a set of temporal causal relationships such that, for each TCR in E from some temporal aggregate, X, to some temporal aggregate, Y, both X and Y are in R.

If  each  temporal  aggregate  in  a  probabilistic  temporal  network  is  assigned,  then  that 
PTN  is  said  to  be  completely  assigned.  The  set  of  all  of  the  assignments  and  associated 
temporal  aggregates  forms  a  complete  assignment. 
Definition 8. The set $\mathcal{C}$ containing (temporal aggregate, aggregate assignment) pairs is a complete assignment (CA) of some PTN (R, E) iff

1. $\forall (X, A) \in \mathcal{C}$, $X \in R$ and A is an aggregate assignment of X.

2. $\forall (X, A), (Y, B) \in \mathcal{C}$, $X = Y \Rightarrow A = B$.

3. $\forall X \in R\ \exists (Y, A) \in \mathcal{C}$ such that X = Y.

Complete assignments are denoted by uppercase script letters from the beginning of the alphabet, e.g., $\mathcal{A}$, $\mathcal{B}$, or $\mathcal{C}$.

When performing inference over a probabilistic temporal network, incomplete evidence as to the state of the network may be held. Such evidence is represented with a partial assignment. In the simplest form, any subset of a complete assignment is a partial assignment. A more complicated case arises when only a partial aggregate assignment is known for some temporal aggregate. Since a PAA is a subset (possibly improper) of an aggregate assignment, a partial assignment to a PTN consists of a subset of the variables of the PTN and associated partial aggregate assignments for the TAs. More formally:

Definition 9. The set $\mathcal{P}$ containing (temporal aggregate, aggregate assignment) pairs is a partial assignment (PA) of some PTN (R, E) iff

1. $\forall (X, P) \in \mathcal{P}$, $X \in R$ and P is a partial aggregate assignment of X.

2. $\forall (X, P), (Y, Q) \in \mathcal{P}$, $X = Y \Rightarrow P = Q$.

PAs are usually denoted with uppercase script letters from the middle of the alphabet, e.g., $\mathcal{P}$ or $\mathcal{Q}$. As a complete assignment is a subset of itself, by definition any complete assignment is also a partial assignment.

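Notation. A partial assignment, $\mathcal{P}$, is said to be a subset of another partial assignment, $\mathcal{Q}$ (denoted $\mathcal{P} \subseteq \mathcal{Q}$), if every (X, P) in $\mathcal{P}$ (except those having $P = \emptyset$) has a corresponding (Y, Q) in $\mathcal{Q}$ such that X = Y and $P \subseteq Q$. A complete assignment, say $\mathcal{C}$, is said to be compatible with a partial assignment, $\mathcal{P}$, if $\mathcal{P} \subseteq \mathcal{C}$; otherwise $\mathcal{C}$ is said to be incompatible with $\mathcal{P}$. If $\mathcal{C}$ is incompatible with $\mathcal{P}$, then at least one temporal aggregate in $\mathcal{C}$ has a different assignment than that in $\mathcal{P}$.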

The  goal  of  belief  revision  is  to  find  the  most  probable  state  of  the  world  given  some 
evidence.  This  is  the  most  probable  explanation. 
Definition 10. Let B be a PTN, let $\mathcal{P}$ be a partial assignment (evidence) of B, and let $\mathcal{C}$ be some complete assignment (explanation) of B. $\mathcal{C}$ is a most probable explanation (MPE) given $\mathcal{P}$ iff for all $\mathcal{A}$ where each $\mathcal{A}$ is a complete assignment of B compatible with $\mathcal{P}$, $P(\mathcal{C} \mid \mathcal{P}) \geq P(\mathcal{A} \mid \mathcal{P})$.

Since $P(\mathcal{A} \mid \mathcal{P}) = P(\mathcal{A}, \mathcal{P})/P(\mathcal{P})$ and an incompatible complete assignment cannot be an MPE (unless the evidence is itself contradictory, in which case all CAs are MPEs), we only need to consider as candidates those complete assignments for which $\mathcal{P} \subseteq \mathcal{A}$. Thus since $\mathcal{P} \subseteq \mathcal{A}$, we derive $P(\mathcal{A} \mid \mathcal{P}) = P(\mathcal{A})/P(\mathcal{P})$. Furthermore, since $1/P(\mathcal{P})$ is a factor in the conditional probability of each explanation $\mathcal{A}$, to find the MPE, we need only compute the probability of each complete assignment, i.e., $P(\mathcal{A})$. $P(\mathcal{A})$ is calculated with the chain rule.
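The reduction above can be written out compactly. Here $\mathcal{A}$ stands for a candidate complete assignment and $\mathcal{P}$ for the evidence; the indexing of the random variables as $r_1, \ldots, r_n$ with assigned states $a_1, \ldots, a_n$ is notation assumed for this sketch only.

```latex
\begin{align*}
P(\mathcal{A} \mid \mathcal{P})
  &= \frac{P(\mathcal{A}, \mathcal{P})}{P(\mathcal{P})}
   = \frac{P(\mathcal{A})}{P(\mathcal{P})}
   \quad \text{when } \mathcal{P} \subseteq \mathcal{A}, \\
\arg\max_{\mathcal{A} \supseteq \mathcal{P}} P(\mathcal{A} \mid \mathcal{P})
  &= \arg\max_{\mathcal{A} \supseteq \mathcal{P}} P(\mathcal{A}), \\
P(\mathcal{A})
  &= \prod_{k=1}^{n} P\bigl(r_k = a_k \mid r_{k+1} = a_{k+1}, \ldots, r_n = a_n\bigr).
\end{align*}
```

The second line holds because $1/P(\mathcal{P})$ is the same positive constant for every compatible candidate, so it does not change which candidate attains the maximum.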

3.2  Cycles  and  Temporal  Ordering 

Now  that  the  basic  definitions  and  properties  have  been  introduced,  this  section 
briefly  explores  the  probabilistic  temporal  network  in  Figure  3.3  and  considers  a  potential 
alternate  representation.  Figure  3.3  shows  a  network  using  a  cyclic  dependency  to  represent 
the  internal  dependencies  in  process  ‘Line-Open,’  i.e.,  a  cyclic  TCR  has  been  used  to 
explicitly  model  the  endogenous  temporal  relationships.  For  ‘Line-Open’  to  be  true  over 
some  interval,  ‘Line-Open’  must  not  be  true  over  any  earlier  intervals. 

Examining the intervals, “earlier” turns out to be either meets or overlaps. This is represented with a disjunctive set containing meets and overlaps: {m, o}. The conditional dependencies are represented using the OR schema. The TCR, LO({m, o}, OR)LO, describes the random variable $\mathrm{OR}_{lo_3}$ such that

\[ P(\mathrm{OR}_{lo_3} \mid \neg lo_1, \neg lo_2) = 0 \quad \text{and} \quad P(\neg \mathrm{OR}_{lo_3} \mid \neg lo_1, \neg lo_2) = 1. \tag{3.9} \]

$\mathrm{OR}_{lo_3}$ replaces LO in $P(lo_3 \mid \neg LO) = 1$ to yield $P(lo_3 \mid \neg \mathrm{OR}_{lo_3}) = 1$. By using cyclic TCRs to explicitly represent the temporal relationships within a process, the knowledge engineer can more clearly “see” the nature of the system being modeled.
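As a check on the mutual exclusion reading of Figure 3.3, the joint distribution over the three ‘Line-Open’ random variables can be enumerated with the chain rule. This sketch assumes that the probabilities not shown in the figure, i.e., those conditioned on the line already having been opened, are zero, following the convention that zero entries are omitted; the variable and function names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# P(lo_x = true | induced OR is false), read from Figure 3.3.
P_TRUE_IF_NOT_OPENED = {
    "lo1": Fraction(1, 3),
    "lo2": Fraction(1, 2),
    "lo3": Fraction(1),
}
# Parents of each induced OR node under the relation set {m, o}.
OR_PARENTS = {"lo1": [], "lo2": ["lo1"], "lo3": ["lo1", "lo2"]}
ORDER = ["lo1", "lo2", "lo3"]  # causes listed before their effects


def joint(assignment):
    """Chain-rule probability of one complete assignment to lo1, lo2, lo3."""
    prob = Fraction(1)
    for rv in ORDER:
        already_open = any(assignment[p] for p in OR_PARENTS[rv])
        # Assumed: the line cannot open again once an earlier interval was open.
        p_true = Fraction(0) if already_open else P_TRUE_IF_NOT_OPENED[rv]
        prob *= p_true if assignment[rv] else 1 - p_true
    return prob


table = {
    values: joint(dict(zip(ORDER, values)))
    for values in product([True, False], repeat=3)
}
```

Under these assumptions the only assignments with nonzero probability are the three in which exactly one interval is open, each with probability 1/3, which is why the dependent values 1/3, 1/2, and 1 appear in the figure.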

Figure  3.6  shows  an  attempt  to  simplify  the  conditional  dependencies  in  process 
‘Line-Open.’  The  conditional  probability  tables  for  each  random  variable  in  process  LO 
Line-Open = {([0900, 0910], lo1), ([0905, 0915], lo2), ([0910, 0920], lo3)}

P(lo1 | ¬LO) = 1
P(lo2 | ¬LO) = 1
P(lo3 | ¬LO) = 1

Figure 3.6 The network in Figure 3.3 rewritten using a cyclic dependency such that the conditional probability table for each RV can be written with the same probability 1 instead of the dependent probabilities 1/3, 1/2, and 1 (not well-formed).


are identical. This simplification is accomplished using the TCR LO(A − {=}, OR)LO,² which states that the random variable in each interval-RV pair is dependent on the random variables in all the other interval-RV pairs. While visually similar to the network in Figure 3.3, this network has a serious problem.


Figure 3.7 Process ‘Line-Open’ from Figure 3.6 drawn with the temporal causal relationship expanded. The loop shows a cycle in the dependencies.

The problem is exposed in Figure 3.7 which shows process ‘Line-Open’ with the TCR expanded into the induced random variables. Notice that this expanded structure reveals violations of the conditional independence assumptions discussed in the presentation of Bayesian networks. Random variable $lo_2$ is dependent on $\mathrm{OR}_{lo_2}$ which is dependent on $lo_1$ which is dependent on $\mathrm{OR}_{lo_1}$ which is dependent on $lo_2$ which is .... $lo_2$ is separated from itself by random variables $\mathrm{OR}_{lo_2}$, $lo_1$, and $\mathrm{OR}_{lo_1}$, indicating that, given knowledge of each of these variables, $lo_2$ is independent of itself, which is clearly contradictory.

² The set, A − {=}, consists of all thirteen interval relations sans equals.

Figure  3.3  demonstrates  an  example  in  which  a  cycle  in  the  PTN  provided  a  useful 
representation  of  the  internal  dependencies  within  a  process.  Figure  3.6,  on  the  other  hand, 
shows  a  case  in  which  the  cycle,  while  intuitively  satisfying,  violates  the  requirements  of 
conditional  independence.  This  raises  the  question:  “Under  what  circumstances  are  cycles 
appropriate  in  probabilistic  temporal  networks?” 

Definition  11.  An  expanded  probabilistic  temporal  network  (EPTN)  is  the  directed  graph 
created by expanding all temporal causal relationships in some PTN.


Figure  3.8  Expanded  Probabilistic  Temporal  Network  for  PTN  in  Figure  3.2.  Labels  on 
arcs  indicate  temporal  relation;  during  inverse,  starts,  meets. 

Figure 3.8 shows the expanded probabilistic temporal network for the PTN from Figure 3.2. The OR node for $o_1$ is not shown as it has no parents and does not affect the probability distribution, i.e., $P(\mathrm{OR}_{o_1} = \text{false}) = 1.0$. Note that a given EPTN is not necessarily a Bayesian network. Cycles can exist or extraneous arcs can be present, i.e., the EPTN need not be a minimal I-map. Redundant induced RVs may also be present. Figure 3.9 presents an optimized network with a joint distribution equivalent to that of Figure 3.8. This optimization process is an avenue of further research.
Figure 3.9 Optimized network for EPTN in Figure 3.8.


Definition 12. A probabilistic temporal network is said to be well-formed iff the corresponding expanded probabilistic temporal network contains no directed cycles, i.e., the EPTN of a well-formed PTN is a directed acyclic graph.

Figure  3.7,  shown  previously,  gives  an  example  EPTN  with  cycles.  As  discussed, 
cycles  in  the  expanded  structure  are  problematic.  A  well-formed  probabilistic  temporal 
network  does  not  contain  any  such  directed  cycles. 
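Well-formedness can be tested directly by expanding a network and searching for a directed cycle. The sketch below does this for the one-process networks of Figures 3.3 and 3.6. It assumes intervals encoded as (start, end) pairs of HHMM integers, and it treats the relation set A − {=} as "any two distinct intervals," which relies on every pair of distinct intervals satisfying exactly one of the thirteen interval relations. All function names are hypothetical.

```python
def expand(intervals, related):
    """Build EPTN edges for a self-loop TCR: parent RV -> OR node -> child RV."""
    edges = {}
    for iy, ry in intervals.items():
        or_node = "OR_" + ry
        edges.setdefault(or_node, set()).add(ry)
        for ix, rx in intervals.items():
            if related(ix, iy):
                edges.setdefault(rx, set()).add(or_node)
    return edges


def has_directed_cycle(edges):
    """Depth-first search; reaching a node still on the current path closes a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}

    def visit(node):
        colour[node] = GREY
        for nxt in edges.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY or (state == WHITE and visit(nxt)):
                return True
        colour[node] = BLACK
        return False

    return any(colour.get(n, WHITE) == WHITE and visit(n) for n in list(edges))


LINE_OPEN = {(900, 910): "lo1", (905, 915): "lo2", (910, 920): "lo3"}


def meets_or_overlaps(a, b):
    """The relation set {m, o} used in Figure 3.3."""
    return a[1] == b[0] or a[0] < b[0] < a[1] < b[1]


def anything_but_equals(a, b):
    """The relation set A - {=} used in Figure 3.6."""
    return a != b
```

With these definitions the expansion under `meets_or_overlaps` is acyclic, while the expansion under `anything_but_equals` contains the loop through `lo1`, `OR_lo2`, `lo2`, and `OR_lo1` drawn in Figure 3.7.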

Lemma 1 ([22]). For any DAG D there exists a probability distribution P such that D is a perfect map of P relative to d-separation, i.e., P embodies all the independencies portrayed in D, and no others.

This lemma, combined with Definition 12, leads directly to

Theorem 1. For each well-formed, finite PTN (R, E) there exists a probability distribution P such that P embodies all the independencies in (R, E), and no others.

Theorem 1 indicates that if we have a well-formed, finite PTN, then we have an associated probability distribution. How can we guarantee that a given PTN is well-formed and finite? If there are a finite number of temporal aggregates in the PTN and each aggregate contains only a finite number of interval-RV pairs, then the PTN is finite. As mentioned earlier, finite PTNs are assumed. Clearly if the PTN structure itself contains no cycles then there can be no cycles in the EPTN and our PTN is well-formed. The problem with this restriction is that we lose significant expressive power. Networks such as that in Figure 3.2 would not be allowed.

Cycles  in  the  EPTN  occur  when  an  interval-RV  pair  becomes  self-dependent.  If 
only  temporal  relations  which  are  strictly  one  directional  are  used,  an  interval-RV  pair 
can  not  possibly  be  self-dependent.  For  example,  if  only  {<}  is  used  in  a  PTN,  no  cycles 
are  possible.  Santos,  in  the  development  of  the  temporal  abduction  problem,  defined  the 
concept  of  monotonicity  [25]  as  applied  to  temporal  relations. 

Definition 13. A set $\mathcal{R}$ of temporal relations is said to be monotonic if and only if for all $R$ in $\mathcal{R}$, $R^c \cap (R^c)^{-1} = \emptyset$, where $R^c$ is the transitive closure of $R$ and $(R^c)^{-1}$ is the inverse of the transitive closure of $R$.

In  the  same  work  [25],  Santos  introduced  the  following  monotonic  set: 

Proposition  1.  The  subset  of  relations  C  =  {<,o,  from  the  original  thirteen 

is  a  monotonic  set. 

Intuitively,  a  monotonic  set,  such  as  C  above,  can  be  said  to  temporally  ‘point  in 
only  one  direction.’  This  is  compatible  with  Suppes’  probabilistic  theory  of  causality  [31] 
and  Shoham’s  criteria  for  causation  [28]  (both  point  based  approaches)  in  which  causation 
can  only  extend  forward  in  time.  For  this  reason,  C  is  said  to  be  the  causal  set  of  temporal 
relations.  The  network  in  Figure  3.3  holds  to  C. 

Theorem 2. If, for probabilistic temporal network (R, E), there exists a monotonic set, Q, of temporal relations such that for each $(\mathcal{R}, \mathcal{M}, X, Y) \in E$, $\mathcal{R} \subseteq Q$; then the PTN (R, E) is well-formed.

Proof. Since the only temporal relations used in the PTN are drawn from Q and Q is monotonic, no interval-RV pair can ever relate to itself temporally (otherwise $Q^c \cap (Q^c)^{-1} \neq \emptyset$) and as there can be no cycles within the TAs themselves, there can be no cycles in the EPTN; thus the PTN is well-formed. □

Combining  Theorem  2  and  the  causal  set  C  from  Proposition  1  leads  us  to  the 
following  definition: 

Definition 14. A causal probabilistic temporal network (CPTN) is a PTN for which Theorem 2 holds with $\mathcal{R} = C$.
The causal PTN model enforces the constraint that causality flows forward in time.
Each  link  in  the  network  advances  in  time.  When  following  a  cycle  from  a  temporal 
aggregate  back  to  itself,  one  always  returns  to  a  different  interval-RV  pair.  The  CPTN 
model  enforces,  through  local  constraints,  a  consistent  theory  of  time. 
Figure 3.10 PTN modeling two people chatting with an occasional conversational trigger. Note the use of set-builder notation.


The  equals  relation,  ‘=,’  is  not  a  member  of  C,  and  cannot  be  a  member  of  any 
monotonic  set  of  relations  as  ‘=’  is  its  own  inverse.  Equals  is,  however,  useful  for  expressing 
simultaneity.  Figure  3.10  shows  an  example  in  which  two  people  are  chatting.  Talker  A 
tends to ‘talk over’ Talker B. To model this example, the TCR from B to A includes
equals  as  well  as  meets.  Figure  3.11  shows  the  EPTN  for  Figure  3.10. 

To ensure that a CPTN extended to use equals is well-formed, each directed cycle
must  have  at  least  one  TCR  in  which  equals  is  not  used.  This  guarantees  ‘temporal 
progression’  in  each  cycle.  A  probabilistic  temporal  network  limited  to  C  U  {=}  with  this 
broken  cycle  property  is  said  to  be  S-Causal  (SCPTN)  (‘S’  for  simultaneity). 
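The broken cycle property can be checked without enumerating cycles: every directed cycle contains a TCR that does not use equals exactly when the TCRs that do use equals form no directed cycle among themselves. The sketch below uses that observation. The triple representation of a TCR, the relation names as strings, and the A-to-B relation set in the usage example are assumptions made for illustration; the check also does not verify that the relations are drawn from C ∪ {=}.

```python
def has_broken_cycle_property(tcrs):
    """tcrs: iterable of (cause, effect, relations) triples.

    Returns True iff the sub-graph of TCRs whose relation set contains '='
    is acyclic, i.e., every directed cycle has a TCR without '='.
    """
    equals_edges = {}
    for cause, effect, relations in tcrs:
        if "=" in relations:
            equals_edges.setdefault(cause, set()).add(effect)

    # Repeatedly delete nodes with no outgoing '=' edge into the remaining
    # nodes; whatever survives lies on or leads into an all-equals cycle.
    nodes = set(equals_edges) | {e for es in equals_edges.values() for e in es}
    changed = True
    while changed:
        changed = False
        for node in list(nodes):
            if not (equals_edges.get(node, set()) & nodes):
                nodes.remove(node)
                changed = True
    return not nodes


# Two talkers as in Figure 3.10: B to A uses equals and meets; the A to B
# relation set shown here is a hypothetical choice.
chat = [("B", "A", {"=", "m"}), ("A", "B", {"m"})]
```

For the `chat` network the cycle between A and B passes through the A-to-B TCR, which does not use equals, so the property holds.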


3.3  A  Related  Model 

In  addition  to  the  other  temporal  representations  mentioned  in  Chapter  II  of  this 
thesis,  Aliferis  and  Cooper  [1]  have  developed,  in  parallel  with  the  work  presented  in 
this  thesis,  a  preliminary  temporally  extended  Bayesian  network  formulation  termed  the 
Modifiable Temporal Bayesian Network-Single Granularity (MTBN-SG). Their research, published shortly after the author’s initial results [33], is discussed in some detail here as they also introduce the idea of representing state over several points in time as a single node in a network structure with arcs between nodes representing temporally qualified causation.

Figure 3.11 EPTN for PTN in Figure 3.10 with n = 4.

A  MTBN-SG  is  primarily  an  extended  time-sliced  Bayesian  network  defined  over  a 
range  of  time  points.  Each  ordinary  node  in  a  MTBN-SG  is  indexed  over  this  entire  range. 
Edges  between  nodes  are  represented  by  mechanism  variables.  A  mechanism  variable  is 
a  Boolean  true/false  random  variable  indicating  whether  the  link  is  active,  i.e.,  whether  a 
dependency  exists  between  the  connected  variables.  Each  such  mechanism  has  an  associ¬ 
ated  lag  random  variable  (Delta  TAs  in  the  PTN)  defined  over  the  range  of  time  points 
indicating  the  delay  between  the  “cause”  and  the  “effect.”  Atemporal  or  abstract  random 
variable  nodes  are  supported  and  are  not  instantiated  for  each  time  point.  The  resultant 
graph  can  have  cycles  to  allow  expressions  of  recurrence  and  feedback.  As  long  as  all 
cycles  in  the  underlying  joint  distribution  have  zero  probability,  the  graph  is  said  to  be 
well-defined. 

Since  the  edges,  both  mechanism  and  lag  components,  are  represented  by  random 
variables, the edges can be both dependent on and causal to other random variables in the network. This representation allows the knowledge engineer to express conditions where a relationship exists between variables only under certain circumstances. The problem with this approach is that joint distributions can be described which are not compatible with
the  Bayesian  model.  Maintaining  consistency  in  the  local  probability  tables  across  random 
variables  then  becomes  a  concern. 

As  indicated  in  the  name,  the  MTBN-SG  model  only  supports  a  single  granularity  for 
the  size  of  the  time  step  in  any  given  network.  Extending  the  model  to  support  multiple 
granularities appears problematic, especially in the case when the granularities are not multiples of one another, e.g., $g_1$ is every 10 minutes and $g_2$ is every 15 minutes. A perhaps more difficult
problem  arises  in  the  model  if  the  start  time  for  one  granularity  is  not  the  same  as  that 
for  another  as  the  granularities  may  be  forever  out  of  phase.  This  problem  is  not  an  issue 
for  our  model.  Individual  processes  or  temporal  aggregates  can  be  modeled  with  arbitrary 
sets  of  intervals.  There  is  no  requirement  that  the  intervals  in  one  TA  match  those  in  other 
TAs  as  the  temporal  causal  relationship  describes  the  desired  relationships. 

Intervals can be modeled in the MTBN with abstract variables, INT_START and INT_END, representing the start and end points of the interval respectively. INT_END is dependent on INT_START such that the end time will never be before the start time. The duration of an interval can be acquired from a third variable, INT_DUR, dependent on both INT_START and INT_END. One problem with this representation arises from the need to use abstract instead of time indexed variables. If one needs to reason with a blend of time-sliced and interval data, then dependencies will exist between the abstract variables and the time-indexed ones.

The  semantics  of  such  arcs  and  the  deployment  transformations  (conversion  to  BN 
form)  are  not  clear.  Presumably,  if,  in  the  MTBN,  an  abstract  variable  was  dependent 
on  a  time  indexed  variable,  then,  in  the  deployed  graph,  the  abstract  variable  would  be 
dependent  on  each  copy  of  the  time  indexed  variable  for  each  time  index.  If  the  time 
indexed  variable  is  dependent  on  the  abstract  variable,  then  the  condition  is  similar  in 
that  each  copy  of  the  time  indexed  variable  is  dependent  on  the  abstract  variable.  These 
dependencies result in high degrees of fan-in and fan-out in the deployed graph, leading to
an excessive number of required probabilities and high complexity.

The MTBN-SG formulation, introduced by Aliferis and Cooper, is interesting in its
high-level similarity to the probabilistic temporal network. Their point-based approach, the
semantic difficulties arising from the abstract variables, and the single granularity restriction
are problems which the PTN does not have.

IV.  Reasoning

This  chapter  develops  an  approach  for  reasoning  over  probabilistic  temporal  networks. 
The  first  section  introduces  the  method  of  calculating  the  probability  of  some  state  of,  or 
explanation  for,  the  system  being  modeled.  Section  4.2  extends  these  calculations  to  find 
the  most  probable  state  of  the  system.  Unfortunately,  finding  the  most  probable  state 
is NP-hard. Section 4.3 presents a subclass of the probabilistic temporal network with
polynomial  time  solvability. 

4.1  Constructing a Partial Order and Using the Chain Rule

In  Section  3.1,  we  discussed  finding  the  most  probable  explanation.  The  most  probable 
explanation  is  the  complete  assignment  with  the  greatest  joint  probability.  As  mentioned, 
this  joint  probability  is  calculated  using  the  chain  rule.  To  efficiently  use  the  chain  rule,  a 
partial  ordering  (from  effect  to  cause)  of  the  random  variables  must  exist.  The  ordering  is 
drawn  from  the  expanded  PTN  and  can  only  be  found  when  the  PTN  is  well-formed  and 
finite. The following algorithm finds a partial ordering for a well-formed and finite PTN:

Algorithm  3:  (Partial  Ordering) 

1.  First,  find  the  EPTN  of  a  well-formed  and  finite  PTN. 

2.  From  the  EPTN,  select  all  RVs  with  no  children.  Place  these  first  in  the  ordering  in 
arbitrary  order. 

3.  Find  all  RVs  among  all  those  not  yet  ordered  such  that  all  children  thereof  are  ordered. 
Place  these  next  in  the  ordering,  again  in  arbitrary  order. 

4.  Repeat Step 3 until no unordered RVs remain.

For  example,  the  PTN  in  Figure  3.3  expands  to  the  EPTN  in  Figure  3.5.  A  partial 
ordering  of  the  RVs  is  found  in  the  following  steps: 

1.  Order: ()  RVs: {$lo_1$, $lo_2$, $lo_3$, $OR_{lo_2}$, $OR_{lo_3}$}

2.  Order: ($lo_3$)  RVs: {$lo_1$, $lo_2$, $OR_{lo_2}$, $OR_{lo_3}$}

3.  Order: ($lo_3$, $OR_{lo_3}$)  RVs: {$lo_1$, $lo_2$, $OR_{lo_2}$}

4.  Order: ($lo_3$, $OR_{lo_3}$, $lo_2$)  RVs: {$lo_1$, $OR_{lo_2}$}

5.  Order: ($lo_3$, $OR_{lo_3}$, $lo_2$, $OR_{lo_2}$)  RVs: {$lo_1$}

6.  Order: ($lo_3$, $OR_{lo_3}$, $lo_2$, $OR_{lo_2}$, $lo_1$)  RVs: {}

yielding ($lo_3$, $OR_{lo_3}$, $lo_2$, $OR_{lo_2}$, $lo_1$) as a partial ordering.
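
Algorithm 3 can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this research: the function name `partial_ordering` and the dictionary `eptn_children` are hypothetical, and the child sets are an assumption read off the chain-rule factors of the Figure 3.3 example (e.g., $OR_{lo_3}$ depends on $lo_1$ and $lo_2$).

```python
from typing import Dict, List, Set


def partial_ordering(children: Dict[str, Set[str]]) -> List[str]:
    """Order RVs from effect to cause: an RV is placed only after all of its children."""
    ordered: List[str] = []
    placed: Set[str] = set()
    remaining = set(children)
    while remaining:
        # Steps 2 and 3: every unordered RV whose children are all already ordered.
        # Childless RVs qualify in the first round because the empty set is a subset of anything.
        ready = sorted(rv for rv in remaining if children[rv] <= placed)
        if not ready:
            raise ValueError("no RV is ready: the network contains a directed cycle")
        ordered.extend(ready)  # order within a round is arbitrary; sorted for repeatability
        placed.update(ready)
        remaining.difference_update(ready)
    return ordered


# Child sets assumed from the chain-rule factors of the Figure 3.3 example.
eptn_children = {
    "lo1": {"OR_lo2", "OR_lo3"},
    "OR_lo2": {"lo2"},
    "lo2": {"OR_lo3"},
    "OR_lo3": {"lo3"},
    "lo3": set(),
}
order = partial_ordering(eptn_children)
print(order)
```

Under these assumed child sets each round has exactly one candidate, so the sketch reproduces the ordering derived step by step in the example.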

Since  a  partial  ordering  exists  for  the  network,  the  chain  rule  can  be  used  to  find  the 
joint  probability  of  each  assignment.  Table  4.1  shows  the  probability  distribution  defined 
by  the  example  in  Figure  3.3.  Only  non-zero  probability  assignments  are  shown  (but  one). 


Table 4.1  The possible complete assignments to the network in Figure 3.3 with associated probabilities. One ‘impossible’ assignment is also shown.

Joint Probability Table for Figure 3.3

Each cell gives the assigned value followed, in parentheses, by its chain-rule factor.

        Line-Open Assignment
Row     [0910,0920]   OR_lo3      [0905,0915]   OR_lo2      [0900,0910]   Probability
(1)     true (1)      false (1)   false (1/2)   false (1)   false (2/3)   1/3
(2)     false (1)     true (1)    true (1/2)    false (1)   false (2/3)   1/3
(3)     false (1)     true (1)    false (1)     true (1)    true (1/3)    1/3
(4)     true (0)      true (1)    false (1)     true (1)    true (1/3)    0
                                                            Total:        1

Each joint probability in Table 4.1 is calculated using the chain rule [22]. For example,
the probability of the complete assignment

$\left\{ LO, \left\{ (([0900,0910], lo_1), true),\ (([0905,0915], lo_2), false),\ (([0910,0920], lo_3), false) \right\} \right\}$  (4.1)

is calculated from

$P(lo_3 = false \mid OR_{lo_3} = true) \cdot P(OR_{lo_3} = true \mid lo_1 = true, lo_2 = false) \cdot P(lo_2 = false \mid OR_{lo_2} = true) \cdot P(OR_{lo_2} = true \mid lo_1 = true) \cdot P(lo_1 = true) = 1 \cdot 1 \cdot 1 \cdot 1 \cdot \frac{1}{3} = \frac{1}{3}$  (4.2)
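
The chain-rule calculation can be checked numerically. The following is a sketch under stated assumptions: the conditional probabilities are read off the rows of Table 4.1 (they are not taken from the figure itself), the induced variables are treated as deterministic ORs, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

HALF, THIRD = Fraction(1, 2), Fraction(1, 3)


def p_lo1(lo1):
    # P(lo1 = true) = 1/3, as in row (3) of Table 4.1.
    return THIRD if lo1 else 1 - THIRD


def p_or(value, *causes):
    # Deterministic OR over the causing RVs (assumed behaviour of the OR schema).
    return Fraction(int(value == any(causes)))


def p_lo2(lo2, or_lo2):
    # If the line opened earlier it cannot open again; otherwise both outcomes have 1/2.
    return Fraction(int(not lo2)) if or_lo2 else HALF


def p_lo3(lo3, or_lo3):
    # The line opens in the last interval exactly when it has not opened before.
    return Fraction(int(lo3 != or_lo3))


def joint(lo1, or_lo2, lo2, or_lo3, lo3):
    """Chain rule in effect-to-cause order: lo3, OR_lo3, lo2, OR_lo2, lo1."""
    return (p_lo3(lo3, or_lo3) * p_or(or_lo3, lo1, lo2)
            * p_lo2(lo2, or_lo2) * p_or(or_lo2, lo1) * p_lo1(lo1))


row3 = joint(lo1=True, or_lo2=True, lo2=False, or_lo3=True, lo3=False)
total = sum(joint(*bits) for bits in product([False, True], repeat=5))
print(row3, total)
```

Summing `joint` over all 32 assignments gives a total of one, which agrees with the "Total" row of Table 4.1 and confirms that only the three listed assignments carry probability mass.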

4.2  Constraint Satisfaction

The previous section showed how to calculate the probability of a complete assignment
to a probabilistic temporal network. This section presents a method for finding
the most probable complete assignment, i.e., performing belief revision on probabilistic
temporal networks. A constraint satisfaction approach is used with mixed Boolean linear
programming. Constraint satisfaction has three main advantages: first, constraints can
be formed to take advantage of the inherent structure of the PTN; second, very efficient
algorithms developed by the operations research community are available; and finally,
alternate explanations, e.g., second or third best, can be found using techniques presented
in [24].

Definition 15. A constraint system is a 3-tuple $(\Gamma, I, \psi)$ where $\Gamma$ is a finite set of
variables, $I$ is a finite set of linear inequalities based on $\Gamma$, and $\psi$ is a cost function from
$\Gamma \times \{true, false\}$ to $\mathbb{R}$.

Our probabilistic temporal network model can be considered to have a layered structure.
The layers consist of temporal aggregates and temporal causal relationships. For this
reason, the system of constraints is presented in two parts, those for TCRs and those for
TAs. For some well-formed PTN $P = (R, E)$, the following steps produce the constraints,
variables, and costs for the temporal causal relationships in $E$ and those for the temporal
aggregates in $R$, i.e., the following steps produce $L(P) = (\Gamma, I, \psi)$.

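1.  For each TCR $(\mathcal{R}, \mathcal{M}, (T_X, \Sigma_X), (T_Y, \Sigma_Y))$ in $E$,

(a)  For each $(i_Y, r_Y) \in T_Y$ construct variables $M^{r_Y}_{\sigma_{X_1}} \ldots M^{r_Y}_{\sigma_{X_n}}$ in $\Gamma$ where $\sigma_{X_1} \ldots \sigma_{X_n}$ are states in $\Sigma_X$. Set costs for each variable as

$\psi(M^{r_Y}_{\sigma_{X_i}}, false) = \psi(M^{r_Y}_{\sigma_{X_i}}, true) = 0$  (4.3)

where $1 \le i \le n$ and add the following constraint to $I$:

$\sum_{i=1}^{n} M^{r_Y}_{\sigma_{X_i}} = 1.$  (4.4)

(b)  For each $(i_Y, r_Y) \in T_Y$ and each $\sigma_X \in \Sigma_X$ let $(i_{X_1}, r_{X_1}) \ldots (i_{X_j}, r_{X_j}) \in T_X$ be those pairs for which $i_{X_h} \mathcal{R}\, i_Y$ with $1 \le h \le j$, then

i.  for each conditional probability of the form

$P(M^{r_Y} = \sigma_X \mid r_{X_1} = \sigma_{X_1} \ldots r_{X_j} = \sigma_{X_j})$  (4.5)

as induced by schema $\mathcal{M}$, construct a variable

$q[M^{r_Y} = \sigma_X \mid r_{X_1} = \sigma_{X_1} \ldots r_{X_j} = \sigma_{X_j}]$  (4.6)

(denoted $q$ in following steps) in $\Gamma$ such that

A.  $\psi(q, false) = 0,$  (4.7)

$\psi(q, true) = -\log P(M^{r_Y} = \sigma_X \mid r_{X_1} = \sigma_{X_1} \ldots r_{X_j} = \sigma_{X_j})$  (4.8)

B.  with the following constraint in $I$:

$q \ge \sum_{h=1}^{j} X^{r_{X_h}}_{\sigma_{X_h}} + M^{r_Y}_{\sigma_X} - j$  (4.9)

(c)  Let $\Gamma_{M^{r_Y}}$ be the set of all $q$ constructed in step (1b) for variable $M^{r_Y}_{\sigma_X}$. For each such variable, add the following constraint to $I$:

$M^{r_Y}_{\sigma_X} = \sum_{q \in \Gamma_{M^{r_Y}}} q$  (4.10)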


2.  For each TA $X = (T_X, \Sigma_X)$ in $R$

(a)  For each $(i_X, r_X) \in T_X$ construct variables $X^{r_X}_{\sigma_{X_1}} \ldots X^{r_X}_{\sigma_{X_n}}$ in $\Gamma$ where $\sigma_{X_1} \ldots \sigma_{X_n}$ are states in $\Sigma_X$. Set costs for each variable as

$\psi(X^{r_X}_{\sigma_{X_i}}, false) = \psi(X^{r_X}_{\sigma_{X_i}}, true) = 0$  (4.11)

where $1 \le i \le n$ and add the following constraint to $I$:

$\sum_{i=1}^{n} X^{r_X}_{\sigma_{X_i}} = 1.$  (4.12)

(b)  For each $(i_X, r_X) \in T_X$ and each $\sigma_X \in \Sigma_X$ let $M_1 \ldots M_j$ be those random variables induced by TCRs $(\mathcal{R}_h, \mathcal{M}_h, Y_h, Z_h)$ for which $1 \le h \le j$ and $Z_h = X$. Then

i.  for each conditional probability of the form

$P(r_X = \sigma_X \mid M_1 = \sigma_{Y_1} \ldots M_j = \sigma_{Y_j}),$  (4.13)

construct a variable

$q[r_X = \sigma_X \mid M_1 = \sigma_{Y_1} \ldots M_j = \sigma_{Y_j}]$  (4.14)

(denoted $q$ in following steps) in $\Gamma$ such that

A.  $\psi(q, false) = 0,$  (4.15)

$\psi(q, true) = -\log P(r_X = \sigma_X \mid M_1 = \sigma_{Y_1} \ldots M_j = \sigma_{Y_j})$  (4.16)

B.  with the following constraint in $I$:

$q \ge \sum_{h=1}^{j} M^{r_X}_{h,\sigma_{Y_h}} + X^{r_X}_{\sigma_X} - j$  (4.17)

(c)  Let $\Gamma_{r_X}$ be the set of all $q$ constructed in step (2b) for variable $X^{r_X}_{\sigma_X}$. For each such variable, add the following constraint to $I$:

$X^{r_X}_{\sigma_X} = \sum_{q \in \Gamma_{r_X}} q$  (4.18)


In this construction, constraints (4.4) and (4.12) ensure that each random variable,
either induced or in a TA, can take on one and only one value. Constraints (4.9) and
(4.10) guarantee that each of the probabilities for TCR induced variables is computed in
concordance with the appropriate temporal relations and schema. Constraints (4.17) and
(4.18) guarantee that the probability of a temporal assignment to a TA is computed with
the appropriate set of conditional probabilities. Variables of the form
$q[r_X = \sigma_X \mid M_1 = \sigma_{Y_1} \ldots M_j = \sigma_{Y_j}]$ are called conditional variables in that they explicitly represent the
dependencies between RVs and are the mechanism for computing the probability of any
complete assignment.


For  example,  consider  again  the  simple  probabilistic  temporal  network  in  Figure  3.3. 
Section  4.1  showed  how  to  calculate  the  probability  of  an  assignment  to  this  network  using 
the  chain  rule  (see  Table  4.1  and  Equation  4.2).  Now,  if  we  take  the  complete  assignment 


$\left\{ LO, \left\{ (([0900,0910], lo_1), true),\ (([0905,0915], lo_2), false),\ (([0910,0920], lo_3), false) \right\} \right\}$  (4.19)

we expect our variable assignments to be

$LO^{lo_1}_{true} = q[lo_1 = true \mid OR_{lo_1} = false] = 1$
$LO^{lo_2}_{false} = q[lo_2 = false \mid OR_{lo_2} = true] = 1$
$LO^{lo_3}_{false} = q[lo_3 = false \mid OR_{lo_3} = true] = 1$
$OR^{lo_1}_{false} = q[OR_{lo_1} = false] = 1$
$OR^{lo_2}_{true} = q[OR_{lo_2} = true \mid lo_1 = true] = 1$
$OR^{lo_3}_{true} = q[OR_{lo_3} = true \mid lo_1 = true, lo_2 = false] = 1$  (4.20)



with all other variables being zero. Since the only variables which accrue costs are the $q[\ldots]$
variables, the cost of the assignment is $-\log(1/3) - \log(1) - \log(1) - \log(1) - \log(1) - \log(1) = -\log(1/3)$
and thus the probability of the assignment is 1/3 as expected. As informally
demonstrated in this example, the cost of a variable assignment is found by summing the
product of each variable in $\Gamma$ and its corresponding cost in $\psi$.

Definition 16. A variable assignment for a constraint system $L = (\Gamma, I, \psi)$ is a function
$s$ from $\Gamma$ to $\mathbb{R}$. Furthermore,

1.  If the range of $s$ is $\{0, 1\}$, then $s$ is a 0-1 assignment.

2.  If $s$ satisfies all of the constraints in $I$, then $s$ is a solution for $L$.

3.  If $s$ is a solution for $L$ and is also a 0-1 assignment, then $s$ is a 0-1 solution for $L$.

Definition 17. Given a constraint system $L = (\Gamma, I, \psi)$, we construct a function $\Theta_L$ from
variable assignments to $\mathbb{R}$ as follows:

$\Theta_L(s) = \sum_{\gamma \in \Gamma} \left[ s(\gamma)\,\psi(\gamma, true) + (1 - s(\gamma))\,\psi(\gamma, false) \right]$  (4.21)

$\Theta_L$ is called the objective function of $L$.

Definition 18. An optimal 0-1 solution for a constraint system $L = (\Gamma, I, \psi)$ is a 0-1
solution which minimizes $\Theta_L$.
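
To make Definitions 16 through 18 concrete, the sketch below evaluates the objective function and finds an optimal 0-1 solution by exhaustive search. This is not the cutting plane method discussed in this chapter; brute force is substituted purely for illustration and is exponential in the number of variables. The toy constraint system (one Boolean RV `A` with an assumed prior of 1/3) and all names are hypothetical.

```python
import math
from itertools import product


def objective(s, cost):
    """Theta_L(s): sum of s(v)*psi(v, true) + (1 - s(v))*psi(v, false) over all variables."""
    return sum(s[v] * cost[(v, True)] + (1 - s[v]) * cost[(v, False)] for v in s)


def optimal_01_solution(variables, constraints, cost):
    """Exhaustive search over all 0-1 assignments; suitable only for toy systems."""
    best, best_cost = None, math.inf
    for bits in product((0, 1), repeat=len(variables)):
        s = dict(zip(variables, bits))
        if all(check(s) for check in constraints):  # s is a 0-1 solution
            value = objective(s, cost)
            if value < best_cost:
                best, best_cost = s, value
    return best, best_cost


# Hypothetical system for one Boolean RV A with P(A = true) = 1/3.
variables = ["A_true", "A_false", "q_true", "q_false"]
cost = {(v, flag): 0.0 for v in variables for flag in (True, False)}
cost[("q_true", True)] = -math.log(1 / 3)   # only conditional variables carry cost
cost[("q_false", True)] = -math.log(2 / 3)
constraints = [
    lambda s: s["A_true"] + s["A_false"] == 1,  # A takes exactly one value
    lambda s: s["q_true"] == s["A_true"],       # each value selects its conditional variable
    lambda s: s["q_false"] == s["A_false"],
]
best, best_cost = optimal_01_solution(variables, constraints, cost)
probability = math.exp(-best_cost)
print(best, probability)
```

Because costs are negative log probabilities, minimizing the objective maximizes the joint probability, and `exp(-cost)` recovers the probability of the selected assignment.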

By finding an optimal 0-1 solution for a constraint system, we find the most probable
explanation for the corresponding PTN. Santos [24] presents a customized algorithm using
the cutting plane method [21] for finding the optimal 0-1 solution. Since any Bayesian
network can be represented as a PTN^1, we know that, in general, belief revision over
PTNs is NP-hard [8,22].

4.3  Polynomial Time Belief Revision: The Generalized Temporal Polytree

The previous section presented a method for performing belief revision on probabilistic
temporal networks. In general, this problem is NP-hard. However, for singly-connected
PTNs (polytrees), belief revision can be done in polynomial time. A polytree is a directed
acyclic graph in which no more than one undirected path exists between any two nodes. The lack of
undirected cycles in the graph structure allows for efficient local decisions. This section
presents the generalized temporal polytree (GTP), a PTN model with a restricted graph
and temporal structure. The EPTN for a GTP is guaranteed to be a polytree.
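
The polytree property is easy to test mechanically: a directed graph is a polytree (or a collection of unconnected polytrees) exactly when its underlying undirected graph has no cycle. The following sketch checks this with a union-find structure; the function name `is_polytree` and the example graphs are hypothetical.

```python
from typing import Hashable, Iterable, Tuple


def is_polytree(nodes: Iterable[Hashable],
                edges: Iterable[Tuple[Hashable, Hashable]]) -> bool:
    """True when no undirected cycle exists, i.e., the skeleton of the graph is a forest."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the component representative, halving the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            # u and v are already connected, so this edge closes an undirected cycle.
            return False
        parent[root_u] = root_v
    return True


tree_like = is_polytree("abcd", [("a", "b"), ("a", "c"), ("c", "d")])
diamond = is_polytree("abcd", [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")])
print(tree_like, diamond)
```

The diamond graph is acyclic as a directed graph but contains an undirected cycle, which is precisely the structure the restrictions introduced in this section rule out.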

First, a pair of additional restrictions on the probabilistic temporal network is
introduced. These two restrictions force the expanded PTN to be a causal tree, i.e., all
nodes (except root nodes) have one and only one incoming edge (cause)^2. A causal tree
structure allows for very easy belief updating and revision. The first requirement is that
the only interval-interval relation allowed is meets. Meets enforces a strictly monotonic
progress in time and, unlike precedes, does not allow “temporally remote causation” [31].
The second requirement is that all intervals across the network have different end-points.
Together, these two requirements impose a causal tree structure on the expanded network.
A probabilistic temporal network holding to these two requirements is termed a Generalized
Causal Temporal Tree.

1 Treat each RV in the BN as a TA with a single interval-RV pair, using the ({=}, PASSTHROUGH)
TCR, and make all intervals in the TAs equivalent.

2 Note that by this definition, the model actually allows a collection of such unconnected trees.

Definition 19. A generalized causal temporal tree (GCTT) is a probabilistic temporal
network in which

1.  $\mathcal{R} = \{m\}$ for each $(\mathcal{R}, \mathcal{M}, X, Y) \in E$, i.e., meets is the only temporal relation
allowed.

2.  All intervals in all temporal aggregates must have unique end-points.

Theorem  3.  The  expanded  probabilistic  temporal  network  of  any  generalized  causal  tem¬ 
poral  tree  is  a  causal  tree. 

Proof. By contradiction. Let $P = (R, E)$ be some generalized causal temporal tree. Let
$N$ be the EPTN of $P$. Assuming that $N$ is not a tree, we know by the definition that there
exists a node, $a$, such that at least two different directed edges enter $a$ from two different
causal nodes (ignoring intervening induced RVs), say $b$ and $c$. Each of these nodes ($a$, $b$, $c$)
has an associated interval, say $[a_s, a_e]$, $[b_s, b_e]$, and $[c_s, c_e]$ respectively. Since, by the definition
of generalized causal temporal tree, $[b_s, b_e]$ meets $[a_s, a_e]$ and $[c_s, c_e]$ meets $[a_s, a_e]$, we have $b_e = a_s$
and $c_e = a_s$ and thus $b_e = c_e$. However, again from the definition of generalized causal
temporal tree, all end-points are unique and thus $b_e \ne c_e$. □
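
The argument of Theorem 3 can be illustrated with a small sketch. As a simplifying assumption, every pair of RVs is treated as potentially linked by a meets TCR (the worst case), and intervals are integer pairs; the helper names and the example intervals are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # (start, end)


def meets(a: Interval, b: Interval) -> bool:
    """Allen's meets relation: a ends exactly where b starts."""
    return a[1] == b[0]


def causes_under_meets(intervals: Dict[str, Interval]) -> Dict[str, List[str]]:
    """For each RV, list the RVs whose interval meets it (its candidate causes)."""
    causes = defaultdict(list)
    for x, ix in intervals.items():
        for y, iy in intervals.items():
            if x != y and meets(ix, iy):
                causes[y].append(x)
    return {rv: causes[rv] for rv in intervals}


def has_unique_end_points(intervals: Dict[str, Interval]) -> bool:
    ends = [end for _, end in intervals.values()]
    return len(ends) == len(set(ends))


gctt = {"a": (0, 10), "b": (10, 20), "c": (10, 30), "d": (20, 40)}
clash = dict(gctt, e=(5, 10))  # e shares the end-point 10 with a

print(has_unique_end_points(gctt), causes_under_meets(gctt))
print(has_unique_end_points(clash), causes_under_meets(clash)["b"])
```

With unique end-points no RV has more than one candidate cause, while the shared end-point in `clash` gives `b` two causes and so breaks the causal tree structure.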

Corollary 1. The EPTN of a GCTT in which constraint 2 in Definition 19 is changed to
start-points instead of end-points has an inverted tree structure.

By connecting together regions with varying end-points (out-regions) with regions of
varying start-points (in-regions), a PTN with polytree structure is formed. A region, then,
is a collection of TAs in which all interval end or start points are different. Regions join
together at a set of TAs referred to as a join-region. All TAs in a join-region are members
of both regions being joined. For example, if an in-region and an out-region are joined,
then all end-points in the join-region must be different from all end-points in the out-region
and all start-points in the join-region must differ from all start-points in the in-region.

Definition 20. A set of temporal aggregates, $R$, forms an out-region if for each

$([s_1, e_1], r_1) \in \bigcup_{(T,\Sigma) \in R} T$  (4.22)

there does not exist another

$([s_2, e_2], r_2) \in \bigcup_{(T,\Sigma) \in R} T$  (4.23)

such that $r_1 \ne r_2$ and $e_1 = e_2$, i.e., all intervals in all temporal aggregates have unique
end-points.

Definition 21. A set of temporal aggregates, $R$, forms an in-region if for each

$([s_1, e_1], r_1) \in \bigcup_{(T,\Sigma) \in R} T$  (4.24)

there does not exist another

$([s_2, e_2], r_2) \in \bigcup_{(T,\Sigma) \in R} T$  (4.25)

such that $r_1 \ne r_2$ and $s_1 = s_2$, i.e., all intervals in all temporal aggregates have unique
start-points.

Definition 22. A set of temporal aggregates, $R$, forms a join-region for two in- or out-regions,
$R_1$ and $R_2$, if $R = R_1 \cap R_2$.
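
Definitions 20 through 22 translate directly into set predicates. The sketch below is illustrative only: temporal aggregates are represented as a dictionary from a TA name to its list of (interval, RV) pairs, and the TA names and intervals are hypothetical rather than taken from any figure.

```python
from itertools import combinations


def _pairs(tas, names):
    # Union of the interval-RV pairs of every TA in the candidate region.
    return [pair for name in sorted(names) for pair in tas[name]]


def _unique(tas, names, endpoint):
    # endpoint = 0 compares start-points, endpoint = 1 compares end-points.
    return all(a[0][endpoint] != b[0][endpoint]
               for a, b in combinations(_pairs(tas, names), 2)
               if a[1] != b[1])


def is_out_region(tas, names):
    """Definition 20: no two distinct RVs share an interval end-point."""
    return _unique(tas, names, 1)


def is_in_region(tas, names):
    """Definition 21: no two distinct RVs share an interval start-point."""
    return _unique(tas, names, 0)


def join_region(region_1, region_2):
    """Definition 22: the TAs that belong to both regions."""
    return set(region_1) & set(region_2)


tas = {
    "start": [((0, 10), "s")],
    "task_a": [((10, 20), "ta")],
    "task_b": [((20, 30), "tb")],
    "end": [((30, 40), "e")],
    "other": [((5, 20), "o")],   # shares end-point 20 with task_a
}
in_region = {"start", "task_a", "task_b"}
out_region = {"task_a", "task_b", "end"}
print(is_in_region(tas, in_region), is_out_region(tas, out_region))
print(join_region(in_region, out_region))
```

Adding `other` to the out-region would violate Definition 20 because its end-point coincides with that of `task_a`.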

To prevent undirected cycles (directed cycles are prevented by the meets restriction),
out-regions are not permitted to join to out-regions. In-regions can join with both
in-regions and out-regions. No temporal causal relationships can extend, however, from
a join-region back into an in-region. This prevents undirected cycles by enforcing the
constraint that all inverted trees in an in-region must end in the join-region (or not enter
the join-region).

Definition 23. A generalized temporal polytree (GTP) is a probabilistic temporal network
$P = (R, E)$ for which there exist sets $I$ (in-regions), $O$ (out-regions), and $J$ (join-regions)
such that

1.  $\mathcal{R} = \{m\}$ for each $(\mathcal{R}, \mathcal{M}, X, Y) \in E$, i.e., meets is the only temporal relation
allowed.

2.  Each TA in the PTN is in some in- or out-region and vice versa.

3.  Each join-region in $J$ connects two in-regions or connects an in-region with an out-region.
Out-regions can not join with other out-regions.

4.  For each TCR, $(\mathcal{R}, \mathcal{M}, X, Y) \in E$, exactly one of the following must hold:

(a)  there exists one and only one $r \in I \cup O$ such that $X, Y \in r$, or

(b)  there exists a $j \in J$ such that $X, Y \in j$, or

(c)  there exists a $j \in J$ and an $o \in O$ such that $X \in j$ and $Y \in o$.

In no case can $X$ be in a join-region and $Y$ be in an in-region outside of the join.

Theorem  4.  The  expanded  probabilistic  temporal  network  of  any  generalized  temporal 
polytree  is  a  polytree. 

Proof. By contradiction. Let $P = (R, E)$ be some generalized temporal polytree. Let
$N$ be the EPTN of $P$. Assuming that $N$ is not a polytree, we know by definition that
there exist at least two nodes such that two distinct undirected paths exist between them.
These two paths form an undirected cycle. Based on Theorem 3 and Corollary 1, there
can not exist more than one unique path between any two nodes within any given in- or
out-region. Also, different regions can only connect together in join-regions. Thus at least
two nodes on the undirected cycle must be in the join-region. Let these two nodes be $a$
and $b$. Since all nodes in the join-region belong to both of the joined in- or out-regions and no cycles
can exist within any single in- or out-region, at least one node on the cycle, say $c$, must
exist outside of the join-region. This leads to two cases: either $c$ is in an in-region or $c$
is in an out-region. Either way, if $c$ is in one region and $a$ and $b$ are in the join-region, there
must be a fourth node, $d$, in the other region from $c$, otherwise the cycle would lie entirely
within one in- or out-region.


Figure  4.1  (Im)possible  shape  of  an  undirected  cycle  in  a  generalized  temporal  polytree. 

This  gives  us  four  nodes  on  our  cycle,  a,  b,  c,  and  d.  We  know  that  a  and  b  are  both 
in  the  join-region  and  we  know  that  both  c  and  d  are  outside  of  the  join-region  and  each  in 
different  regions.  This  gives  us  a  structure  as  in  Figure  4.1.  Since  out-regions  can  not  join 
to  out-regions,  either  node  d  or  node  c  must  lie  in  an  in-region.  Let  us  assume  that  this 
is  node  d.  Since  a  TCR  can  not  extend  from  the  join-region  out  into  an  in-region,  a  TCR 
must  extend  from  the  TA  containing  d  into  the  join-region.  This  TCR  must  be  such  that 
the  interval  associated  with  d  meets  two  nodes  in  the  join-region,  however  since  all  nodes 
in the join-region are also in the same in-region as $d$, no two nodes in the join-region can
have the same start point and thus $d$ can not meet these two nodes and thus an undirected
cycle can not exist. □

Although  not  stated  in  the  formal  definition  of  the  generalized  temporal  polytree, 
interval  start-points  in  evidence  TAs  and  end-points  in  leaf  TAs  do  not  need  to  be  different 
from  other  start-  or  end-points  as  evidence  nodes  are  not  dependent  on  anything  and 
nothing  is  dependent  on  leaf  nodes. 


Figure 4.2  A Generalized Temporal Polytree depicting a program execution scenario.

Figure 4.2 shows a GTP modeling a program execution scenario. Program-A executes
Program-B to complete Task-A. Program-C must complete Task-B. Task-B, however,
requires that Task-A complete immediately prior. The start and task TAs form an
in-region and the task and end TAs form an out-region. Task-A and Task-B together form
a join-region. Figure 4.3 shows the expanded probabilistic temporal network for this GTP.


Figure  4.3  The  EPTN  for  the  GTP  in  Figure  4.2. 


V.  Knowledge  Engineering 

The  probabilistic  temporal  network  provides  the  knowledge  engineer  with  a  powerful  tool. 
This  chapter  discusses  techniques  for  applying  the  PTN  to  particular  problems.  The  first 
section  is  focused  on  extending  an  existing  knowledge  base  into  the  temporal  domain. 
In  the  second  section,  further  techniques  appropriate  for  completely  temporal  models  are 
discussed. 

5.1  Extending  a  Bayesian  Network  with  Time 

Probabilistic temporal networks provide an easy migration from a timeless Bayesian
representation to a fully temporal representation. For example, consider the Bayesian
network in Figure 5.1 representing the following scenario:

Tech  support  is  only  available  if  the  phones  are  working  and  the  support  technician 
has  arrived  at  work.  The  probability  that  the  phones  are  working  is  0.95  and  that  the 
support  technician  has  arrived  is  0.875. 

This scenario is easily and adequately modeled with a Bayesian network. Suppose that we
also have the following additional requirement:

The  support  tech  has  a  fifty  percent  chance  of  starting  work  between  7:15am  and 
7:45am,  25  percent  chance  between  7:45am  and  8:15am,  and  a  12.5  percent  chance 
between  8:15am  and  8:45am.  If  the  tech  is  not  in  by  8:45am,  she  is  not  coming  in  at 
all. 

To  reflect  this  change,  the  Bayesian  network  in  Figure  5.1  would  have  to  be  modified  to 
explicitly  contain  each  of  the  above  three  intervals  with  support-available  being  dependent 
on  all  three.  The  probabilistic  temporal  network  approach  provides  a  cleaner  alternative. 


Figure  5.1  A  Bayesian  network  for  a  simple  tech  support  scenario. 

PW = {([-∞, ∞], pw)}
    P(pw) = 0.95

TA = {([0715, 0745], a), ([0745, 0815], b), ([0815, 0845], c)}
    P(a | ¬TA) = 0.5    P(b | ¬TA) = 0.5    P(c | ¬TA) = 0.5

SA = {([-∞, ∞], sa)}
    P(sa | PW, TA) = 0.95     P(sa | ¬PW, TA) = 0.0
    P(sa | PW, ¬TA) = 0.0     P(sa | ¬PW, ¬TA) = 0.0

Edge label: ({d}, OR)

Figure 5.2  Tech-support Probabilistic Temporal Network. The probabilities used in the
figure are the dependent probabilities rather than the break-out used in the
text description, e.g., (0.5, 0.25, 0.125) becomes (0.5, 0.5, 0.5). PT is
shorthand for PASSTHROUGH.
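
The conversion from break-out probabilities to dependent probabilities mentioned in the caption of Figure 5.2 can be sketched as follows. We assume that each dependent probability is the chance of arrival in an interval given no arrival in any earlier interval; the function name `dependent_probabilities` is hypothetical.

```python
from fractions import Fraction
from typing import List


def dependent_probabilities(break_out: List[Fraction]) -> List[Fraction]:
    """Convert P(event in interval i) into P(event in interval i | no earlier event)."""
    remaining = Fraction(1)  # probability mass not yet consumed by earlier intervals
    dependent = []
    for p in break_out:
        if remaining == 0:
            raise ValueError("earlier intervals already account for all probability")
        dependent.append(p / remaining)
        remaining -= p
    return dependent


arrival = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8)]
print(dependent_probabilities(arrival))
print(1 - sum(arrival))  # probability that the tech never arrives
```

The leftover mass of 1/8 matches the scenario: the technician arrives at all with probability 0.875, the value used in the original Bayesian network.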

5.1.1  Temporalizing a Bayesian Network.  First, however, a technique must be
found to provide a temporal binding for the nodes in the Bayesian network. Since BNs
usually do not contain explicit temporal information, we represent the nodes in Bayesian
networks as temporal aggregates defined over a single interval from negative infinity to
positive infinity^1. Thus each RV $X$ with states $\Sigma$ from a BN becomes the TA $X =
(\{([-\infty, \infty], x)\}, \Sigma)$. Then, under the assumption that $[-\infty, \infty] = [-\infty, \infty]$, the temporal
relationship between TAs $X$ and $Y$, where an edge exists from $X$ to $Y$ in the BN, is simply
X({=}, PASSTHROUGH)Y.
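
A minimal sketch of this temporalization step follows. The data classes `TemporalAggregate` and `TCR`, the function `temporalize`, and the convention of naming each RV by the lower-cased TA name are all hypothetical choices made for illustration; only the single interval [-∞, ∞] and the ({=}, PASSTHROUGH) relationship come from the text.

```python
import math
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set, Tuple

ALWAYS = (-math.inf, math.inf)  # the single interval [-inf, inf]


@dataclass(frozen=True)
class TemporalAggregate:
    name: str
    pairs: Tuple[Tuple[Tuple[float, float], str], ...]  # (interval, RV name) pairs
    states: FrozenSet[str]


@dataclass(frozen=True)
class TCR:
    relations: FrozenSet[str]  # allowed interval-interval relations
    schema: str
    cause: str
    effect: str


def temporalize(bn_states: Dict[str, Set[str]],
                bn_edges: List[Tuple[str, str]]):
    """Wrap each BN node in a one-interval TA and each BN edge in a ({=}, PASSTHROUGH) TCR."""
    tas = {
        name: TemporalAggregate(name, ((ALWAYS, name.lower()),), frozenset(states))
        for name, states in bn_states.items()
    }
    tcrs = [TCR(frozenset({"="}), "PASSTHROUGH", x, y) for x, y in bn_edges]
    return tas, tcrs


boolean = {"true", "false"}
tas, tcrs = temporalize({"PW": boolean, "TA": boolean, "SA": boolean},
                        [("PW", "SA"), ("TA", "SA")])
print(tas["SA"])
print(tcrs)
```

The result is only a starting point: as explicit temporal information becomes available, individual intervals and relations would be replaced by hand.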

Then, if random variable $x$ in temporal aggregate $X$ holds state $\sigma$, we can interpret
this to mean that $X$ holds state $\sigma$ for all time^2 (or at least for the time of discourse). If
TA $X$ represents a boolean proposition, then we could instead interpret $x = true$ to mean
that at some time our boolean proposition holds and if $x = false$ then at no time does the
proposition hold. This is the interpretation used in our examples here.

Notation. For convenience, a temporal aggregate so adapted from a Bayesian network is
termed an adapted temporal aggregate (ATA).

1 While an open interval may be more proper, the closed interval [-∞, ∞] is used for consistency of
notation. Practically, [-∞, ∞] could be replaced by any interval containing the time of discourse.

2 Keep in mind that the temporal interval is the primitive temporal individual and thus when we talk
about a ‘time’ we are talking about an interval and not a point. If we say 11 AM, an interval such as
[1100, 1101] is implied.


Figure 5.2 shows the PTN modeling the above scenario. Since explicit temporal
information is provided, ‘tech-arrived’ is represented by a temporal aggregate defined over
three intervals. The edge between ‘tech-arrived (TA)’ and ‘support-available (SA)’ is
replaced with ({d}, OR) indicating that “Support is available if the tech arrives.” TA
is also dependent on itself with the TCR TA({<, m}, OR)TA constraining TA, when
combined with the conditional probability tables, to be true over only one interval.

Using this technique, any Bayesian network can be represented with a probabilistic
temporal network. As additional temporal information is gathered, temporal aggregates
can be modified to contain the actual times rather than [-∞, ∞]. Semantically, the
transformation can be awkward since the direction of causality within BNs can raise implicit
temporal constraints. It remains the task of the knowledge engineer to complete the
‘temporalization’ of the model.

5.1.2  The  time  of  reference.  Something  is  still  missing.  The  network  in  Figure  5.2 
can  tell  us  if  tech  support  is  available  but  we  can’t  tell  when.  In  other  words,  Figure  5.2 
can  answer  the  question  “Is  tech-support  ever  available?”  but  not  the  question  “It  is  12pm. 
Is  tech-support  available  now?”  A  time  of  reference  is  needed. 

As mentioned previously, each TA $X$, adapted from the original Bayesian network,
can be interpreted as indicating if the proposition associated with $X$ holds at some time.
We need a mechanism to determine what that time is. Consider ‘support-available’ and
‘phones-working’ in Figure 5.2. We could change the interval from [-∞, ∞] to something
like $[t, t + \epsilon]$ but then our reasoning algorithms and the structure of the network would
have to be changed to constrain $t$. Instead, we take a different approach.

Consider again the original Bayesian network in Figure 5.1. If one asks “At what
time is support available?”, intuitively we answer “When the phones are working and
the tech has arrived.” If one then asks “It is between 0715 and 0745 and the phones are
working. Is support now available?”, ‘support-available’ is only dependent on
‘phones-working’ and ‘tech-arrived’ during the interval specified even though this interval is not
expressed anywhere in the network.

Now consider the PTN in Figure 5.2. If one again asks “It is between 0715 and 0745
and the phones are working. Is support now available?”, we can’t answer the question,
as ‘support-available’ is dependent on ‘tech-arrived’ not only during [0715,0745] but also
during [0745,0815] and [0815,0845]. Tech support is available at some time. If, however,
‘tech-arrived’ is forced to be false after [0745,0815] (zero probability), then ‘support-available’
is effectively no longer dependent on ‘tech-arrived’ after [0745,0815]. The following
calculations show this:

Let $\theta$ be the partial assignment for our query.

$\theta = \{(PW, \{([-\infty, \infty], true)\}), (TA, \{([0745, 0815], false), ([0815, 0845], false)\})\}.$  (5.1)

Let $\vartheta$ be some complete assignment compatible with $\theta$. Using the chain rule, we derive

$P(\vartheta \mid \theta) = P(sa \mid pw, TA) \cdot P(pw) \cdot P(TA)$  (5.2)

$P(TA) = P(OR_{sa} \mid c, b, a) \cdot P(c \mid b, a) \cdot P(b \mid a) \cdot P(a)$  (5.3)

Since we know $b = false$, $c = false$, and $pw = true$ we can simplify the calculation to

$P(\vartheta \mid \theta) = P(sa \mid pw, TA) \cdot P(TA)$  (5.4)

$P(TA) = P(OR_{sa} \mid c, b, a) \cdot P(a)$  (5.5)

and since $P(OR_{sa} \mid c, b, a) = 1$ if $a = true$ and $P(OR_{sa} \mid c, b, a) = 0$ if $a = false$, $P(TA) = P(a)$.
We can now write

$P(\vartheta \mid \theta) = P(sa \mid pw, a) \cdot P(a)$  (5.6)

Thus ‘support-available’ is only dependent on ‘tech-arrived’ during [0715,0745].


The  idea  of  forcing  falseness  for  future  propositions  is  compatible  with  our  intuitions 
about  causality.  If  we  are  interested  in  the  state  of  the  world  at  present,  it  can  not  be 
dependent  on  what  hasn’t  yet  occurred. 

This research does not, however, take the approach of simply clamping the future
states to false as that approach only allows forward reasoning. One also wants to reason
backwards, e.g., to find the most probable time for support to be available. So instead, a
different approach is taken: introducing a new temporal aggregate, Now, containing an
interval for each time of interest. By making other (non-adapted) temporal aggregates
dependent on Now in such a way that a given TA can not be true over an interval unless
Now is false for all earlier intervals, we can use Now to block future events. If the time
of reference is known, Now can be clamped to true for that interval, preventing TAs from
being true at any time after the current time. Also, if some state of the world is clamped,
then belief updating can be used to determine what the most probable time is.

TA = {([0700, 0730], a), ([0730, 0800], b)}
    P(a | ¬TA, ¬Now) = 0.5    P(b | ¬TA, ¬Now) = 0.5

GH = {([1630, 1700], c), ([1700, 1730], d)}
    P(c | ¬GH, TA, ¬Now) = 0.5    P(d | ¬GH, TA, ¬Now) = 1.0

SA:
    P(sa | PW, TA, ¬GH) = 0.95     P(sa | ¬PW, TA, ¬GH) = 0.0
    P(sa | PW, TA, GH) = 0.0       P(sa | ¬PW, TA, GH) = 0.0
    P(sa | PW, ¬TA, ¬GH) = 0.0     P(sa | ¬PW, ¬TA, ¬GH) = 0.0
    P(sa | PW, ¬TA, GH) = 0.0      P(sa | ¬PW, ¬TA, GH) = 0.0

Now = {([0000, 0700], m), ([0700, 0730], n), ([0730, 0800], o), ([0800, 1630], p),
       ([1630, 1700], q), ([1700, 1730], r), ([1730, 2400], s)}
    P(m | ¬Now) = 1/20    P(q | ¬Now) = 1/5
    P(n | ¬Now) = 1/19    P(r | ¬Now) = 1/9
    P(o | ¬Now) = 1/18    P(s | ¬Now) = 1
    P(p | ¬Now) = 6/17

Edge label: ({<, m}, OR)

Figure 5.3  Probabilistic temporal network demonstrating time of reference. Empirical
evidence of the density of support calls is used to assign probabilities associated
with intervals in Now.

Figure  5.3  models  the  following  scenario  using  the  Now  construct. 

Tech support is only available if the phones are working and the support technician
has arrived at work and is not at lunch. The phones almost always work. The support
tech has a fifty percent chance of arriving between 07:00 and 07:30 and a twenty-five
percent chance between 07:30 and 08:00. If the tech is not in by 08:00, she is not
coming in at all. The tech has a fifty percent chance of going home between 16:30
and 17:00 and a fifty percent chance between 17:00 and 17:30, given that she comes in
at all.

Figure 5.4 A simple network using a delta TA. Event A can occur over many different
intervals; however, B is only dependent on A if A occurs over at least 30 minutes.

The Now temporal aggregate allows what-if queries where Now is unclamped to
predict at which time events are most likely to happen, as well as queries where the time is
clamped to determine the most probable state of the system. Different strategies can be
used for the conditional probabilities in Now. The PTN in Figure 5.3 uses probabilities
for Now representing the density of support calls. Since Now is weighted by call density,
we can make queries such as “What is the most likely time for a client to call, but support
not be available?” It does not, however, allow meaningful predictive queries of the form
“Given that it is before 0700, when is the most likely time for the tech to go home?” since
if the time of reference is given, the future is clamped to false.
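The blocking effect of a clamped Now can be sketched in a few lines. This is an illustration under stated assumptions rather than the thesis's inference procedure: times are written as HHMM integers, the function name is hypothetical, and the two conditional values of 0.5 are the ones shown for ‘tech-arrived’ in Figure 5.3. An interval counts as blocked when the clamped Now interval precedes or meets it.

```python
# Chain conditionals for 'tech-arrived' as read from Figure 5.3:
# each value is P(interval true | no earlier TA interval true, not blocked by Now).
TA_CHAIN = [((700, 730), 0.5), ((730, 800), 0.5)]


def arrival_probabilities(chain, now_interval=None):
    """Unconditional probability of arrival in each interval, optionally with Now clamped."""
    remaining = 1.0  # probability that the tech has not arrived in an earlier interval
    result = {}
    for interval, conditional in chain:
        # Now precedes (<) or meets (m) this interval: the interval is still in the future.
        if now_interval is not None and now_interval[1] <= interval[0]:
            conditional = 0.0
        result[interval] = remaining * conditional
        remaining *= 1.0 - conditional
    return result


print(arrival_probabilities(TA_CHAIN))              # Now unclamped
print(arrival_probabilities(TA_CHAIN, (700, 730)))  # Now clamped to [0700, 0730]
print(arrival_probabilities(TA_CHAIN, (0, 700)))    # Now clamped to [0000, 0700]
```

With Now unclamped the sketch reproduces the fifty and twenty-five percent arrival chances of the scenario. Clamping Now to [0000, 0700] forces both arrival intervals to zero, which is why predictive queries about the future are not meaningful once the time of reference is given.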

5.1.3 Temporally quantified causation. It is straightforward to model that cause
must precede effect. It is not, however, straightforward to model by how much the cause
must precede the effect. With the thirteen basic interval-interval relations there is no direct
way to quantify the temporal distance between cause and effect. Should our representation
support temporally remote causation at all? Patrick Suppes, in his A probabilistic theory of
causality, strongly rejects the concept: “There is almost a feeling of ludicrousness in the idea
of one body acting on another at a slow and leisurely pace from remote time and space” [31].
In principle the author concurs with this philosophy; in practice, however, the
infinity of causes and effects lying between a remote cause and the resulting effect cannot
be represented efficiently in a computational model. These myriad underlying mechanisms
are merely another source of uncertainty.

In a probabilistic temporal network, to model that one process relates to another, an
edge is created from the causal TA to the effect TA with a temporal causal relationship
containing a set of temporal relations and a random variable schema. One could create new
random variable schemas such as DELTA-OR such that the induced random variables are
independent of those intervals in the causal process that do not satisfy the additional
quantified constraints but do satisfy the temporal relations. The problem with this approach
is that many different RV schemas would be needed (for duration, overlap, precedes, etc.).
Instead of creating new schemas, we simply introduce a new temporal aggregate between the
cause and effect to enforce the quantification.

[Figure 5.5 network contents, as far as recoverable: P(sa | LD) = 0.0, P(sa | ¬LD) = 1.0]

Figure 5.5 Probabilistic temporal network modeling a tech-support representative going
to lunch and maybe not returning to work.

The new temporal aggregate, termed a delta TA, lies between the cause and effect
TAs. The delta TA contains a set of intervals enforcing the quantification. A TCR from
the cause TA to the delta TA selects the appropriate intervals in the cause TA. A TCR
from the delta TA to the effect TA passes on the causal information. Figure 5.4 shows a
simple example modeling an old car starting on a winter morning. The car must have been
started at least twenty-five minutes before it can start moving. In this simple example the
delta TA is named ‘Getting Warm’. Figure 5.5, modeling the following scenario, shows an
application of the delta TA to our tech support realm in which lunch is always 30 minutes.
The delta TA is a useful tool for designing probabilistic temporal networks in general, not
just for extending Bayesian networks.

Tech support is only available if the support technician is not at lunch. The tech
always goes to lunch with equal likelihood at 11:00am, 11:15am, or 11:30am.
Lunch lasts exactly one half-hour. Going to lunch later in the day slightly
increases the chances that the tech will not return to work. In particular we are
interested in times 10:50am (or earlier), 11:10am, 11:20am, 11:40am, 11:50am,
and 12:10pm (or later).
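The fixed half-hour offset in this scenario can be checked by direct enumeration. The sketch below is only an illustration of the arithmetic, not a construction of the delta TA itself: the function name is hypothetical, times are minutes since midnight, and the chance of the tech not returning is ignored because the scenario gives no number for it.

```python
from fractions import Fraction

LUNCH_MINUTES = 30
START_TIMES = [11 * 60, 11 * 60 + 15, 11 * 60 + 30]  # 11:00, 11:15, 11:30
P_START = Fraction(1, len(START_TIMES))              # equal likelihood for each start


def p_at_lunch(minute):
    """Probability that the tech is at lunch at the given minute of the day."""
    return sum(
        (P_START for start in START_TIMES if start <= minute < start + LUNCH_MINUTES),
        Fraction(0),
    )


for hour, minute in [(10, 50), (11, 10), (11, 20), (11, 40), (11, 50), (12, 10)]:
    print(f"{hour}:{minute:02d}", p_at_lunch(hour * 60 + minute))
```

Under these assumptions the probability of being at lunch is 1/3 at 11:10, 2/3 at 11:20 and 11:40, 1/3 at 11:50, and zero at the two outer times of interest.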

5.2  Some  Guidelines  for  Building  a  Temporal  Knowledge  Base 

Using PTNs to extend an existing knowledge base with explicit temporal information
is effective; however, the disadvantages of carrying along the atemporal semantics of BNs
are significant. These disadvantages include the necessity of constructs such as Now to
provide temporal reference. Ultimately, the knowledge base should be built from the ground
up with explicit temporal information. This section briefly presents a few guidelines for
developing probabilistic temporal networks.

5.2.1 General Guidelines. Unlike Bayesian networks, PTNs allow cycles. Cycles
are very important for representing recurrence, periodicity, and endogenous change;
however, they can be a two-edged sword as they introduce the need to avoid cycles in the
underlying probability structure. By only using a monotonic set of temporal relations,
the need to check for cycles can be avoided. Furthermore, by using the causal set,
C = {<, o, s, fi, di, m}, one gets the added benefit of temporal consistency, i.e., causality
only extends forwards in time. Networks restricted to just C are termed Causal PTNs.
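Membership in the causal set can be tested mechanically once the relation between two intervals is known. The following sketch assumes the standard definitions of the thirteen interval-interval relations over (start, end) pairs with start < end; the function names and the HHMM integer encoding are illustrative choices, not part of the thesis.

```python
CAUSAL_SET = {"<", "o", "s", "fi", "di", "m"}


def allen_relation(x, y):
    """Return the basic interval relation that holds between intervals x and y."""
    xs, xe = x
    ys, ye = y
    if xe < ys:
        return "<"    # x before y
    if ye < xs:
        return ">"    # x after y
    if xe == ys:
        return "m"    # x meets y
    if ye == xs:
        return "mi"   # x met by y
    if xs == ys and xe == ye:
        return "="
    if xs == ys:
        return "s" if xe < ye else "si"   # starts / started by
    if xe == ye:
        return "f" if xs > ys else "fi"   # finishes / finished by
    if ys < xs and xe < ye:
        return "d"    # x during y
    if xs < ys and ye < xe:
        return "di"   # x contains y
    return "o" if xs < ys else "oi"       # overlaps / overlapped by


def is_causal(cause, effect):
    """True if the cause interval stands in a relation from C to the effect interval."""
    return allen_relation(cause, effect) in CAUSAL_SET


print(is_causal((700, 730), (730, 800)))  # meets
print(is_causal((730, 800), (700, 730)))  # met by
```

In every relation of C the cause interval starts no later than the effect interval, which is one way to read the claim that causality only extends forwards in time.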

While philosophically debatable, it is often necessary to represent simultaneity in
practical systems. In point based temporal models, simultaneity is represented with the
equals relation. This is also true for interval models. CPTNs, though, do not allow ‘=’.
Since cyclic dependencies arise when a TA becomes dependent on itself over some interval,
we can allow ‘=’ as long as, for every cycle in the CPTN, at least one TCR on the cycle
does not use ‘=’. Such a network is termed an S-Causal PTN. SCPTNs require that cycles
in the PTN be checked; however, this check is much simpler than that required for
PTNs in general.
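One way to carry out this check is to look only at the TCRs that use ‘=’: every cycle contains a TCR without ‘=’ exactly when the ‘=’ edges alone form no cycle. The sketch below is an assumed formulation, not the thesis's algorithm. It represents each TCR as a hypothetical (source, target, relation set) triple and runs a depth-first search over the ‘=’ edges.

```python
def is_s_causal(tcrs):
    """True if no cycle in the network consists solely of TCRs that use '='."""
    # Keep only the edges whose relation set contains '='.
    graph = {}
    for source, target, relations in tcrs:
        if "=" in relations:
            graph.setdefault(source, []).append(target)
            graph.setdefault(target, [])

    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}

    def has_cycle(node):
        color[node] = GRAY
        for nxt in graph[node]:
            if color[nxt] == GRAY:
                return True        # back edge: a cycle made only of '=' edges
            if color[nxt] == WHITE and has_cycle(nxt):
                return True
        color[node] = BLACK
        return False

    return not any(color[node] == WHITE and has_cycle(node) for node in graph)


ok = [("A", "B", {"=", "<"}), ("B", "A", {"<", "m"})]
bad = [("A", "B", {"="}), ("B", "A", {"=", "o"})]
print(is_s_causal(ok), is_s_causal(bad))
```

The sketch does not verify that the remaining relations come from C; it only covers the extra condition that ‘=’ introduces.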

Inference over probabilistic temporal networks is NP-hard [8,22]. This constrains
the size of networks that can be reasoned with. The generalized temporal polytree defines
a class of PTNs for which inference is polynomial. These types of networks are useful for
modeling systems whose processes can be grouped temporally by starting, working, and
finishing. Figure 4.2 shows an example in which three concurrent programs are being
executed to complete two tasks.

5.2.2 Processes and Events. Intervals can model both processes³ and events [3].
Processes are generally described by ‘ing’ words such as walking and talking. If ([a, b], w)
is an interval-RV pair for a process such as ‘tech arriving at work,’ then w being true implies
that ‘arriving at work’ also holds for all intervals contained in [a, b]. An event, on the other
hand, does not hold for all sub-intervals. Consider ‘tech arrived at work,’ again represented
by ([a, b], w). Just because w is true, we cannot assert that the tech arrived at work during
some subinterval of [a, b]. Our model does not explicitly differentiate between events and
processes. The knowledge engineer can represent either.

Often,  the  exact  interval  that  an  event  occurs  in  is  not  known.  In  this  case,  the 
interval-RV  pairs  in  a  temporal  aggregate  represent  intervals  during  which  the  event  may 
take  place.  In  other  words  the  interval  encapsulates  small  scale  uncertainty  that  is  not 
important  to  the  situation  being  modeled. 

5.2.3  Mutual  Exclusion.  Many  situations  contain  events  that  are  not  recurrent, 
i.e.,  they  either  do  not  happen,  or  they  happen  exactly  once.  For  example,  consider  a 
light-bulb  that  may  burn  out  sometime  in  the  scenario.  The  bulb  can  burn  out  only  once 
(no  replacement),  and  may  not  burn  out  at  all.  Such  events  are  referred  to  as  one-shots 
and  can  easily  be  represented  with  a  temporal  aggregate  with  a  single,  self  dependent, 
temporal  causal  relationship. 

To construct the temporal aggregate, we need to decide on $\Sigma$, the set of states, and
on T, the set of intervals we are interested in. Since we are modeling something that can
either happen or not, we have only two states. Let true indicate that the event happens
during the interval and false indicate that the event does not happen. How many intervals
are needed? This depends on the resolution needed to model the situation; we will use three
consecutive interval-RV pairs, $\{([a, b], r_1), ([b, c], r_2), ([c, d], r_3)\}$, representing the
intervals during which the light might burn out. There is no requirement that these intervals
be consecutive; they can overlap (see Figure 3.3) or be disjoint.

Figure 5.6 Probabilistic temporal network modeling a light-bulb. Demonstrates mutual
exclusion relationship between intervals.

³ Processes is not used here in the sense of what a temporal aggregate represents.

How probable is it that the bulb will burn out? If it is certain that the bulb will burn out
(during the scenario) then the sum of the independent probabilities must be one; in any
case $P(r_1) + P(r_2) + P(r_3) \le 1$. However, since there is a mutual exclusion relationship
between the intervals, $r_1$, $r_2$, and $r_3$ are not independent random variables. Instead we
must order the interval-RV pairs and make each pair dependent on all prior pairs. As
far as the probabilities are concerned, there is no preference for the ordering; semantically,
however, the ordering should be from ‘earliest’ to ‘latest’, where ‘earliest’ might be
defined by the causal set, C, of temporal relations (see Proposition 1 on Page 3-15). For
our problem, $r_3$ is dependent on $r_2$ and $r_1$, $r_2$ is dependent on $r_1$, and $r_1$ is dependent on
nothing.

The next step is to convert our independent probabilities to conditional probabilities.
Let us assume that there is a 10% chance that the bulb will burn out during each of the
intervals, e.g., $P(r_1) = 0.10$. Clearly, since $r_1$ is dependent on nothing, $P(r_1) = 1/10$.
This leaves 90%. $r_2$ will take the next 10%, or 10/90, so $P(r_2 \mid \neg r_1) = 1/9$. This leaves
80% for $r_3$, so $P(r_3 \mid \neg r_1, \neg r_2) = 10/80 = 1/8$. If the bulb burning out were certain, that is
$P(r_1) + P(r_2) + P(r_3) = 1$, then the conditional probability for the ‘latest’ RV would be 1.
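The conversion generalizes to any ordered list of mutually exclusive intervals: each conditional is the interval's unconditional probability divided by the probability mass still unassigned. A minimal sketch follows; the function name is hypothetical and exact fractions are used so the results can be compared with the values in the text.

```python
from fractions import Fraction


def chain_conditionals(marginals):
    """Convert ordered one-shot probabilities into P(r_k | no earlier r_j is true)."""
    remaining = Fraction(1)  # probability that the event has not yet happened
    conditionals = []
    for p in marginals:
        if p > remaining:
            raise ValueError("probabilities of mutually exclusive intervals exceed 1")
        conditionals.append(p / remaining if remaining else Fraction(0))
        remaining -= p
    return conditionals


print(chain_conditionals([Fraction(1, 10)] * 3))  # the light-bulb example
print(chain_conditionals([Fraction(1, 3)] * 3))   # a certain event
```

For three intervals at 10% each the sketch yields 1/10, 1/9, and 1/8. For a certain event split evenly over three intervals it yields 1/3, 1/2, and 1, so the conditional for the ‘latest’ RV is 1.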

We have now defined our temporal aggregate $BO = (T, \Sigma)$ such that

$T = \{([a, b], r_1), ([b, c], r_2), ([c, d], r_3)\}$ and $\Sigma = \{\text{true}, \text{false}\}$  (5.7)

and computed our conditional probabilities. The next step is to define the temporal causal
relationship to capture the conditional dependencies. Since our three intervals are
consecutive, only two temporal relations are needed, meets and precedes. Since the bulb
cannot burn out in an interval if it burned out in any prior interval, we can use the OR schema.
This gives our TCR as

$BO(\{<, m\}, OR)BO$  (5.8)

and our final conditional probability tables as

$P(r_1 \mid \neg BO) = 1/10$

$P(r_2 \mid \neg BO) = 1/9$  (5.9)

$P(r_3 \mid \neg BO) = 1/8$.

Figure  5.6  shows  the  corresponding  network. 

5.2.4 Sure Events. A sure event is a fact that we want to explicitly model in the
PTN. An example of a sure event can be seen in Figure 3.2 in which ‘Critical-Operations’
is shown defined over only one interval, [0855,1805], with probability of one. We could
have changed our conditional probability tables for ‘Vault-Open’ rather than
having the ‘Critical-Operations’ aggregate at all.

The reason ‘Critical-Operations’ is explicitly modeled is twofold. First, critical
operations has causal influence on ‘Vault-Open’ and, as such, should be explicitly modeled.
Secondly, by explicitly modeling ‘Critical-Operations’ we get the added benefit of
having only one conditional probability distribution which applies to all interval-RV pairs in
‘Vault-Open’ rather than having a different one for each pair.

Why is only one interval needed in ‘Critical-Operations’? Since we would want the
temporal aggregate to appear false when critical operations are not occurring, one would
expect the additional intervals [0000, 0855] and [1805, 2400], each having probability of
true set to zero. However, the OR schema is designed such that if no interval exists
satisfying the temporal relation, the constructed random variable has a zero probability
of being true, so these additional intervals are unnecessary. This property gives us the
advantage of minimizing the number of intervals needed to express processes with true and
false states.
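The property can be illustrated with a small deterministic sketch of an OR-style schema. Everything here beyond the [0855,1805] interval is an assumption: the function names, the choice of a containment test as the temporal relation, and the sample query intervals are hypothetical and are not taken from Figure 3.2.

```python
def or_schema(cause_pairs, effect_interval, satisfies):
    """OR over the states of cause intervals that satisfy the temporal relation.

    Returns False when no cause interval qualifies, mirroring the zero
    probability of being true described in the text.
    """
    return any(
        state
        for interval, state in cause_pairs
        if satisfies(interval, effect_interval)
    )


def contains(cause, effect):
    """Assumed relation for this sketch: the cause interval covers the effect interval."""
    return cause[0] <= effect[0] and effect[1] <= cause[1]


# 'Critical-Operations' defined over a single interval that is true with certainty.
CRITICAL_OPERATIONS = [((855, 1805), True)]

print(or_schema(CRITICAL_OPERATIONS, (900, 1000), contains))  # inside critical operations
print(or_schema(CRITICAL_OPERATIONS, (600, 700), contains))   # before critical operations
```

No explicit false intervals for the early morning or the evening are needed: a query interval outside [0855,1805] finds no qualifying cause interval and the schema returns false.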
VI. Recommendations and Conclusions

This chapter presents recommendations for several avenues of future research. None
of these recommendations are show stoppers; the probabilistic temporal network, as it
stands, is an excellent representation for complex, dynamic systems. However, since the
PTN is unproven, the most important step for further effort is to demonstrate this
excellence by implementing a real-world, large-scale model. Good domains for this large-scale
model include security analysis and medical diagnosis.

6.1  Recommendations  for  Future  Research 

The probabilistic temporal network can represent very complicated and traditionally
difficult domains. This research has focused on exploring recurrence and periodicity,
temporal spacing between cause and effect, and modeling the time-of-reference. These
are traditional problems for temporal models. Current and future efforts are focused on
exploring these and other knowledge engineering issues.

This thesis introduces a constraint satisfaction formulation for performing belief
revision (Section 4.2). This formulation needs to be extended to perform belief updating
(finding the most likely state of a given interval-RV pair or temporal aggregate). The
constraint set also needs to be enhanced to take better advantage of the structure of
the network.

Performing belief revision is in general NP-hard. To address this, the generalized
temporal polytree was introduced, which, because of the polytree structure of its
dependencies, allows polynomial time belief revision. We are currently investigating practical
domains for which the GTP is tenable. The question also remains as to what exactly the
maximal tractable class of PTNs is.

Overlapping intervals in a temporal aggregate are troublesome. The theory, as it
stands, allows overlapping intervals so that events happening over intervals can be
expressed. For example, if a switch could be on from 1000 to 1030 or 1015 to 1045, this
condition could be represented as $\{([1000, 1030], S_0), ([1015, 1045], S_1)\}$ where $S_0$ and $S_1$
are random variables for the switch's position. $S_1$ would be conditioned on $S_0$ to prevent
the switch from being on over both intervals. The problem arises in that now the switch
could be considered both on and off in the interval [1015,1030]. Originally, this wasn’t
considered a problem as the temporal causal relation (TCR) resolved any ambiguity from
the perspective of the caused process. One possibility is to make the interval itself
random. For example, $\{([1000, 1030], S_0), ([1015, 1045], S_1)\}$ might become $\{(I, On)\}$ where
$P(I = [1000, 1030]) = P(S_0 \mid \ldots)$ and $P(I = [1015, 1045]) = P(S_1 \mid \ldots)$. This solution gets
us to only one interval; however, there are now two sorts of probabilities to deal with when
doing computation.

Most work to date has been within the discrete realm. Future research will focus
on modeling continuous domains. Using continuous, rather than discrete, sets of states
($\Sigma$) in temporal aggregates is straightforward. For example, we might have a TA,
$Temp = (T_P, \Sigma_P)$ where $T_P = \{([0000, 0100], t_1), \ldots, ([2300, 2400], t_{24})\}$ and
$\Sigma_P = \mathbb{R}$. Temp models changes in the peak temperature over the course of a day. We
could have a second TA, $NitDay = (T_N, \Sigma_N)$ where
$T_N = \{([0000, 0700], n_1), ([0700, 1900], n_2), ([1900, 2400], n_3)\}$
and $\Sigma_N = \{\text{night}, \text{day}\}$. With these two TAs, we would like to model peak temperature
changing over the course of the day. Temperature during a given hour is dependent on
whether or not it is day or night, on the temperature during the previous hour, and on
the rate of change between the previous two hours. Constructing the network structure is
trivial (see Figure 6.1).


Figure  6.1  A  probabilistic  temporal  network  modeling  peak  temperature  changing  over 
the  course  of  a  day. 
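The dependency structure described for the temperature example can be enumerated directly. The sketch below is an assumed reading of that description, not a reproduction of Figure 6.1: the function names are hypothetical, times are HHMM integers, and each hourly variable is given the NitDay interval that covers it plus the one or two preceding hourly variables as parents.

```python
# NitDay intervals with their random-variable names, following the text.
NIT_DAY = [((0, 700), "n1"), ((700, 1900), "n2"), ((1900, 2400), "n3")]


def hourly_intervals():
    """The 24 one-hour intervals of Temp: (0, 100), (100, 200), ..., (2300, 2400)."""
    return [(hour * 100, (hour + 1) * 100) for hour in range(24)]


def parents(k):
    """Names of the variables that t_k (k = 1..24) is assumed to depend on."""
    start, end = hourly_intervals()[k - 1]
    # Day/night variable whose interval covers this hour.
    deps = [name for (s, e), name in NIT_DAY if s <= start and end <= e]
    # Previous hour, and the hour before it (needed for the rate of change).
    deps += [f"t{k - back}" for back in (1, 2) if k - back >= 1]
    return deps


print(parents(1))   # first hour of the day
print(parents(8))   # 0700-0800
print(parents(24))  # last hour of the day
```

The difficult part, as the text goes on to note, is not this structure but the continuous distributions attached to it.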
The difficulty arises in developing appropriate continuous distribution functions for
the domain and representing the causal connection between processes (developing
appropriate random variable schema) as well as the conditional dependencies in the caused
process. Also, even with continuity in states, without continuity in time, continuous change
cannot truly be represented. A potential approach is to use a structure similar to the one
discussed for dealing with overlapping intervals in which a continuous density function is
used to give the probability distribution over the temporal interval space.

Among  the  avenues  for  further  research  discussed  here,  modeling  continuous  change 
is  perhaps  the  most  interesting.  In  a  sense,  being  able  to  represent  continuity  would 
“complete”  the  probabilistic  temporal  network  model,  allowing  the  model  to  fully  represent 
natural  systems. 

6.2  Conclusion 

The research presented here develops a new knowledge representation unique in its
ability to represent both time and uncertainty. The technique, the probabilistic temporal
network, draws from the independence semantics of Bayesian networks and from the
temporal representation in the interval algebra. The proven probabilistic nature allows
knowledge engineers to draw on previously developed statistical data as well as the entire
field of probability theory. This property is crucial for developing well-defined, non-ad-hoc
models.

By  directly  representing  processes  as  temporal  aggregates  and  modeling  the  causal 
relationships  between  the  processes  with  temporal  causal  relationships,  complex  systems 
of  interacting  processes  can  be  modeled.  Being  able  to  model  such  systems  is  crucial  to 
successfully  automating  domains  such  as  medical  diagnosis,  story  understanding,  planning 
and  scheduling,  and  financial  forecasting.  Mastery  of  these  and  related  domains,  such  as 
security  analysis  and  combat  modeling,  is  crucial  for  the  continued  success  of  the  United 
States  Air  Force.  These  domains  all  share  in  common  the  need  to  reason  with  both  time 
and  uncertainty — the  domain  of  the  probabilistic  temporal  network. 